AI ASAP: ChatGPT, Claude, Midjourney, Flux & More | Arnold Oberleiter | Skillshare
AI ASAP: ChatGPT, Claude, Midjourney, Flux & More

Teacher: Arnold Oberleiter



Lessons in This Class

    • 1.

      Introduction

      1:08

    • 2.

      What is AI?

      3:13

    • 3.

      What are LLMs like ChatGPT, Claude, Gemini, etc

      13:54

    • 4.

      The Interfaces of LLMs

      8:39

    • 5.

      What can LLMs do?

      8:20

    • 6.

      Prompt Engineering

      10:59

    • 7.

      More Prompt Engineering Tips

      9:36

    • 8.

      Customizing LLMs with System Prompts and RAG (Retrieval Augmented Generation)

      9:19

    • 9.

      Perplexity and HuggingChat

      1:07

    • 10.

      Developers Can Use LLMs via OpenAI API

      7:11

    • 11.

      Recap of LLMs

      2:57

    • 12.

      The Diffusion Model Explained

      5:38

    • 13.

      Prompt Engineering for Diffusion Models: Starting with DALL-E

      7:15

    • 14.

      Midjourney Basics

      10:09

    • 15.

      Ideogram and Adobe Firefly

      7:55

    • 16.

      Open Source Models

      13:33

    • 17.

      Recap of Picture Generation with Diffusion Models

      1:31

    • 18.

      AI Videos with Kling AI

      15:34

    • 19.

      Text to Speech with ElevenLabs & More

      21:20

    • 20.

      Transcribing with Whisper

      3:25

    • 21.

      Generating AI Music with Udio

      6:38

    • 22.

      Recap and THANK YOU!

      3:39


65 Students

-- Projects

About This Class

Master AI Quickly—Without the Overwhelm

You're busy—we get it. You want to harness the power of AI but don't have time for a 50-hour course.

Imagine gaining practical AI skills that immediately boost your efficiency in graphics, text, emails, code, and more—all without the complexity.

Why This Course Is Perfect for You:

  • Learn Fast: Dive straight into practical AI applications that you can use right away.

  • Boost Efficiency: Automate and enhance your daily tasks to save time and resources.

  • Stand Out: Impress colleagues and friends with your newfound AI expertise.

What You'll Gain:

  • Clear Understanding of AI Essentials: Grasp AI, LLMs, and diffusion models without technical jargon.

  • Master Prompt Engineering: Unlock the full potential of ChatGPT and other LLMs like Gemini, Claude or Llama with effective prompts.

  • Create Stunning Visuals: Use diffusion models like DALL-E, Adobe Firefly, Stable Diffusion, Flux, Recraft and Midjourney to generate amazing graphics.

  • Explore AI in Media: Delve into AI-powered videos, voiceovers, and music to elevate your content creation with tools like ElevenLabs, Kling, Runway, Pika, and more.

In short: You will master AI as fast as possible.


Level: All Levels



Transcripts

1. Introduction: You're busy, and I get it. AI can seem complicated, and you want to learn it as quickly as possible. With a packed schedule and a full-time job, you don't have time for a 50-hour course. You just want practical AI skills to boost your efficiency in graphics, text, emails, code, and more. If that's you, this course is perfect. Imagine impressing everyone with AI knowledge right when it counts and coming across as a true pro. In this course you get a clear understanding of AI, LLMs, and diffusion models; how to use LLMs like ChatGPT with prompt engineering; a look at multimodality and the top-performing models; prompting techniques for diffusion models like DALL-E, Adobe Firefly, Midjourney, Stable Diffusion, Flux, and more; plus insights into AI-powered video, voice, and even music creation. By the way, if you are wondering who I am: my name is Arnie, and I was teaching AI classes before ChatGPT was even a thing, so I have been in the game a relatively long time. I also run a small German YouTube channel, and that's what I do. 2. What is AI?: Before we dive deeper into the AI world, we need to define what AI actually is. AI is simply a term in computer science; the goal is to create machines with human-like intelligence, for example pattern recognition, decision making based on data, and task execution. And don't think of Terminator; these are simple tasks, like writing text the way ChatGPT does. The ultimate goal is AGI, artificial general intelligence, which means learning, understanding, problem solving, and creative work as well as or better than humans. AGI would be smarter than most humans; that is the goal, and nobody knows exactly when it will be reached. The goal beyond that, and nobody knows if it will ever happen, is ASI, artificial superintelligence, an AI smarter than all humans combined. Again, don't think of Terminator. What AI is not: it is not all-knowing, it is not self-aware, it has no emotions, and the current aim is simply to achieve a set goal. You tell the AI "write me some text" or "make me a picture," and it does that. That's where we are right now; there is also robotics and so on, but that is not the main topic of this course. Some examples: voice assistants like Siri and Google Assistant, but also GPT Voice, which understand and respond to voice commands. GPT Voice and the Whisper API are really cool, and we will dive into them later in the course. Then we have recommendation systems, which are old news: think of Netflix, Spotify, or YouTube, where the algorithm looks at your behavior and surfaces similar videos. We also have autonomous driving: self-driving cars use AI to understand where they are and how to drive. Tesla's FSD, for example, is real AI; the cars are not programmed for one specific road, they look at the road and adjust their behavior. And of course we have LLMs and diffusion models: large language models make text, and diffusion models make pictures. That is the core of this course, and because it's the core, we start with LLMs. In the next video we take a closer look at what LLMs are.
3. What are LLMs like ChatGPT, Claude, Gemini, etc: Most people know ChatGPT. ChatGPT is an LLM, you can do a lot with it, and we will do a deep dive into it. But let me tell you, there are a lot more LLMs. If you go to the Chatbot Arena website, you see many different models: ChatGPT comes from OpenAI, Gemini comes from Google, Grok comes from xAI (Elon Musk), and Claude comes from Anthropic. In this video I want to show you how an LLM works, because you need to understand concepts like tokens and the structure of an LLM in order to use these models correctly and as fast as possible. Basically, an LLM is just two files, and we'll use Llama 2 as a simple example. (If you already know exactly what an LLM is and how it works, feel free to skip this lecture.) The first file is the parameter file; I'll just mark it with a P for parameters. The second file exists only to run those parameters, so I'll call it the run file. The run file is usually written in C or Python (either works), and most of the time it is only about 500 lines of code. The parameter file is where the magic happens, because that file is gigantic. Take Llama 2, the open-source LLM from Meta; they offer different sizes, and for this example we use the 70B model, which means 70 billion parameters, so you know this is a relatively big file. How do we get all these parameters? We train the file on a lot of text, roughly 10 terabytes of it, pulled from all over the internet: Wikipedia articles, websites, and much more. The result gets compressed down to a file of only about 140 gigabytes. So we train on 10 terabytes of text and end up with roughly 140 gigabytes; you can think of the parameter file a bit like a zip file that compresses all that data down. Compressing it requires a lot of GPU power, which is also why NVIDIA has been such a great stock over the last years: look at the NVIDIA chart and you see a gigantic run, because everybody needs GPUs. But this is not about stocks. I am keeping this really simple (I have more detailed explanations elsewhere, but we don't need them in this course): we compress 10 terabytes of text into a 140-gigabyte parameter file, and then we have the second file, the run file, which is just a few lines of code.
With an open-source LLM like Llama 2 or Llama 3, we can download these two files and run them locally on our PC, which gives us maximum data security because nothing goes over the internet. These two files are a little bit magical, because the transformer architecture works in the background. You can simply think of a neural network; we don't need to go that deep. Basically, the neural network sees words and predicts which word most likely comes next. Because we trained on all that text, the LLM has learned how text is structured. If we ask, for example, "What should I eat today?", the LLM predicts the words a human would most likely expect. This stage is called pre-training, and with pre-training alone we basically just hallucinate text out of the file. Then comes the second stage, fine-tuning. In fine-tuning we give the LLM lots of examples of how humans want their responses. We would feed in a question, "What should I eat today?", together with an answer humans like, for example "You could eat steak today." Do this over and over and the LLM learns how humans want their responses. That is fine-tuning, the second part of building an LLM. The last part is so-called reinforcement learning, which we can break down really simply: we ask a question, we get an answer, and we tell the LLM whether that answer is good or not. So there are three phases of training. In pre-training we use a lot of GPU power to compress a lot of text into a smaller, zip-like file that we can hallucinate text out of. To make those hallucinations better we do fine-tuning, feeding in questions with answers structured the way humans like, so the LLM learns how we want our responses. And finally, reinforcement learning: we look at an answer, give a thumbs up or thumbs down, and the LLM learns further how we want our responses. The next really important thing: inside the transformer architecture are neural nets, and neural nets work with weights, basically with numbers. To make sense to the neural net, our input has to become numbers. So when we feed a question into an LLM, it first turns the question into numbers, the so-called tokens. With these numbers the neural net can do its calculations: which word most likely comes next? I want to show you how these tokens are structured. If we open a tokenizer and type in "What can I eat today", we see it becomes 5 tokens and 20 characters. If we press "Token IDs", we see what the LLM actually sees: numbers, and with these numbers the neural net does its calculations and gives us a good response. If I press clear and then "show example", you see a bigger example, and here you also see that not every single word is one token.
Words get divided up a little differently: "invisible", for example, is two tokens, and the period at the end is its own token. So there are a lot of different tokens, and if we press "Token IDs" we see exactly what the LLM sees and calculates with. Why am I showing you this? Because there is a token limit. Every single LLM has a limit on how many tokens it can keep in view at once. OpenAI's article "What are tokens?" tells us that a token is roughly four characters in English, which means 1,500 words are roughly 2,048 tokens. This matters because every LLM has a different token limit. Right now, GPT-4 Turbo, GPT-4o, and many other models have roughly a 128,000-token limit; there are models with a 2-million-token limit, and smaller open-source models with only a 4,000-token limit. The important thing to understand is that as soon as the token limit is reached, the LLM no longer remembers the things you discussed with it earlier. A quick example in ChatGPT: I tell it "write a story about a fox", and the first tokens get generated. Then I keep chatting about other things in the same chat, say "tell me a story about a frog", and new tokens get generated. As soon as we reach the token limit, the LLM no longer knows the earlier question or its answer, because it only ever sees the last tokens. In ChatGPT's case the context window is relatively big, 128,000 tokens, roughly 100,000 words, but beyond that it no longer knows what you talked about before. So please remember: only the last tokens count, and everything beyond that is out of the LLM's knowledge. There are techniques to extend this knowledge, for example RAG technology, which we will talk about later, but for now you need to understand that every LLM has a token limit. Eventually the limits may get so big that we no longer have to think about them, but right now they exist and we need to know about them. If you ever wonder why the LLM has forgotten what you talked about earlier, it is simply because the token limit was reached. If you want to check this for yourself in code rather than on the web tokenizer, there is a small sketch of counting tokens right below.
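A minimal sketch of counting tokens with OpenAI's tiktoken library. This is my own illustrative example, not something from the course; the encoding name and the 128,000-token limit are assumptions that roughly match current GPT-4-class models.

# Count the tokens a prompt will use before sending it to a model.
# Requires: pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by GPT-4/GPT-3.5-era models;
# newer GPT-4o-class models use "o200k_base".
encoding = tiktoken.get_encoding("cl100k_base")

text = "What can I eat today?"
token_ids = encoding.encode(text)

print(token_ids)                  # the numbers the model actually "sees"
print(len(token_ids), "tokens")   # roughly 4 characters per token in English

# Crude guard against blowing past an (assumed) 128k context window:
CONTEXT_LIMIT = 128_000
if len(token_ids) > CONTEXT_LIMIT:
    print("Prompt too long - the oldest parts of the chat would be forgotten.")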
In this video you learned how an LLM works. There are just two files: a parameter file and a run file. The run file is a bit of code that runs the parameter file, and the parameter file is a huge amount of internet text compressed down into a smaller file, similar to a zip file; that compression needs a lot of GPU power and is called pre-training. After pre-training comes fine-tuning, where we feed the LLM questions and answers so it learns how we want our responses. After fine-tuning, the final step is reinforcement learning: we ask questions, get answers, and rate whether they are good, and with this last phase the LLM gets better at its tasks. You also saw that the transformer architecture works in the background. Those are neural nets, and neural nets calculate with numbers; that's why our words have to be divided into tokens. With these tokens the model calculates which word most likely comes next. You need to understand tokens because every LLM has a token limit; as soon as it is reached, the LLM no longer knows what you discussed earlier. It always looks at the last tokens, and the limit is model-dependent: sometimes 4,000 tokens, sometimes up to 2 million. One last thing: the questions we ask the LLM matter a lot, because good questions get good answers. That is called prompt engineering, and more on that later. We went through this quickly and not in complete detail, but it is more than enough to work with these models: you need to know that the context is not unlimited before ChatGPT forgets things, and you need to know that good output only comes from good input. That is prompt engineering, and I want to talk about it in the next section. 4. The Interfaces of LLMs: In this video I want to show you some of the most important LLMs and their interfaces. You already saw that there are a lot of different LLMs and that you can find countless of them on the chatbot arenas. The most important ones, at least as I see it, are ChatGPT from OpenAI, Claude from Anthropic, Gemini from Google, and possibly open-source models, which you can use either on Groq or locally with Ollama. We start with ChatGPT because I think it is, at least right now, the best one. Yes, some people love Claude because Claude is also really good at coding, and yes, both can code. I want to show you the ChatGPT interface in detail, because if you understand ChatGPT you understand every other interface as well. At the bottom is the bar where you type your questions; these questions are called prompts, and prompt engineering is the art of writing the right ones. If you want to upload something, there is an attach button: you can upload pictures or PDFs and have them analyzed. There is also a "search the web" button; press it and ChatGPT searches the web. Let's test it once: press search and type "Bitcoin price today". You get the text back plus some links you can click, the sources, and you can see that ChatGPT searched the web and used CoinMarketCap and so on. If you open a new chat in the left corner, it is empty again, and your old chats are listed there. The next thing you can do is click the model name and use different models. There is the normal GPT-4o, great for most tasks, and GPT-4o with Canvas. Canvas is really nice: say you want to generate some code, for example "give me the Python code for a Snake game." ChatGPT opens the canvas, and in this canvas we can edit the code a little. This is really nice.
On the right side you can click to review the code, port it to other languages like JavaScript, fix bugs, add logs, and add comments; if you code, you know what I mean. If we generate normal text, the canvas is also nice because we can edit the text there: suggest edits, adjust the length (make it shorter, for example, and when you send it off it gets rewritten but shorter), adjust the reading level (for graduate school or for kindergarten), and add a final polish, where ChatGPT automatically rewrites and restructures the text a little so you get better output. Lastly, you can also add images if you like, and you get nice little images. Besides the canvas there is o1-preview, the model that thinks. If you give ChatGPT a hard task, it is able to think a little before it answers: "Is this a good YouTube title: I like it on Mars? Think about keywords, click-through rate and more." ChatGPT starts to think; it generates tokens for itself, you can watch the thinking process, and then it can come up with better answers because it always gives itself new tokens to think through, and there is our output. Besides o1-preview there is o1-mini, which does basically the same thing but faster. Under "more models" there are currently GPT-4o mini and the legacy GPT-4 model. If you just want temporary chats, you can switch those on too. Under the question mark you can report illegal content, look up keyboard shortcuts, and find the terms and policies, release notes, and help guide. In the lower-left corner you can upgrade your plan. I currently pay 20 bucks a month, but you can also start for free. The business plan is 25 bucks a month; you get basically the same thing, but the most important difference is that your data is automatically excluded from training, which is a little safer. On the left side you can collapse the sidebar and bring it back, and you can search the chats you have already had with ChatGPT; pressing "new chat" gives you a new chat. Then there are the GPTs; I'll show you more on GPTs later, but if you press "Explore GPTs" you can search specific GPTs that other people have built. If you want to do programming, for example, click Programming and find GPTs tailored for it; here is a GPT for Python, and if you press "Start chat" you can simply chat with this GPT, which is, like I said, specifically for Python. That's basically the ChatGPT interface. If we go into Claude, you see the interface is quite similar: you type what you want, and you can also upgrade Claude. The interface is a little simpler, but it does basically the same thing as ChatGPT.
I ask Claude for Snake code, and Claude also gives me the code and opens something like a canvas. This here is Gemini; right now it is showing German for me, but Gemini is also a normal LLM and can do basically the same things as ChatGPT and Claude. This here is Groq, where you can use open-source LLMs; the interface is minimalistic, you type your prompt, or you can also talk to it. By the way, you can also install ChatGPT on your PC and have it as an app, and you can install it on your smartphone and talk to it. Here is the ChatGPT app, and if we talk to it, it answers. "Hey ChatGPT, tell me a small story about a fox." "Once upon a time, in a lush forest, there lived a clever fox named Fiona. Known for her quick wit, Fiona loved to explore and learn about everything around her. One day, she stumbled upon a trap set by hunters; using her cunning..." That is the Advanced Voice Mode; I think it is currently a paid feature, so if you pay for ChatGPT and install the application on your PC, you can use it. The last thing I want to show you is Ollama. If you download Ollama, it runs locally on your PC. Don't worry if you don't want to do this; I just want to show how it works. You press Download, then go to Models and search for the model you want. Then you go into your terminal, and everything runs locally. For the model you want, for example Llama 3.2, you run "ollama run llama3.2": copy that into your terminal and it downloads the model, or runs it if it is already installed. I have it installed, so I can do the same things here: "tell me a story about a rock", and Llama tells me a story about the rock. This is especially nice for data privacy, but of course Ollama itself has no fancy interface. You can link it with something like AnythingLLM, but that is too big for this course, because we want to learn this stuff fast. So if you want to run models locally you absolutely can, but if you are just starting out, use ChatGPT in the standard interface. In this video you saw the interfaces that matter if you want to use LLMs as fast as possible, and in the next video I'll show you what LLMs can do. For the curious, a small sketch of talking to a local Ollama model from code follows right below.
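A minimal sketch (my own, not from the course) of calling a locally running Ollama model from Python. It assumes the Ollama server is running on its default local port and that llama3.2 has already been pulled with "ollama run llama3.2" as shown above.

# Talk to a local Ollama model over its default REST endpoint.
# Requires: pip install requests, and Ollama running locally.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",   # Ollama's default local API
    json={
        "model": "llama3.2",
        "prompt": "Tell me a short story about a fox.",
        "stream": False,   # return one complete answer instead of a token stream
    },
    timeout=120,
)

print(response.json()["response"])   # nothing leaves your machine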
5. What can LLMs do?: In this video I want to give you a quick overview of what LLMs can do, and it doesn't matter which LLM you are in: most frontier models can do the same things, and the open-source models catch up over time. Every LLM can make text or code bigger and make text or code smaller; in other words, it can expand or summarize. Let's make an example: you can type in a few words and get a lot of words. "Give me a marketing text for my website, AI with Arnie." (No, I don't really have this marketing text; I happen to be using the o1-preview model just because it was active.) ChatGPT thinks a little about what marketing text to write, and then I get my answer: a little bit of text turned into a lot of text. Next, we can summarize text. Here is an article on Medium about LLMs; copy a chunk of it, paste it into ChatGPT, and say "summarize in bullets," and there it is: bullet points about the text. Same thing with code: you can generate a lot of code really fast. "Give me the code for an HTML web page that has three buttons. I can only turn on two of the buttons at the same time. It should illustrate that it is not possible to be broke, smart, and busy at the same time." It generates the HTML; let's see if it works. I copy the code into a new text file, save it as .html, and open the page: broke, smart, busy, and you can never switch on all three at once, which is exactly the point. And of course, if you have a lot of code, for example on a web page, you can also ask the LLM to make it smaller, so yes, you can summarize code too. You can also generate tables: this one, for example, shows the macros of a banana, so "text" can also mean tables. Now comes the fun part, because LLMs can also use tools, like a calculator, a Python interpreter, or a diffusion model (a diffusion model makes pictures). "What is 3 times 98 times 98?" Send it and you see "analyzing": ChatGPT writes a small Python script, and if you press "view analysis" you can see it used a Python interpreter to get the result. "Make a picture of a banana," and ChatGPT uses a diffusion model like DALL-E to create it, and there is the banana. We can also analyze data. Here is a dataset with social media usage: which apps people are on, Snapchat, TikTok, Pinterest, and so on. It is a really big table, and we can analyze it. It gives me a table back (in German at the moment, excuse me, but we want English right now), which brings me to the next point: LLMs can also translate, in both directions; just tell ChatGPT "translate this into English." The dataset contains 1,000 rows with the columns user ID, app, daily minutes spent, posts per day, likes per day, and follows per day, across Pinterest, Facebook, Instagram, TikTok, and LinkedIn. "Make a chart out of this," and because it can use tools, ChatGPT runs Python to create a nice graph: Facebook, Instagram, LinkedIn, Pinterest, and so on. You can switch to an interactive chart and use different colors if you prefer. A rough sketch of what happens behind the scenes here follows below.
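Roughly speaking, ChatGPT's data-analysis tool writes and runs Python a bit like the following behind the scenes. This is only an illustrative sketch: the file name and the exact column names are assumptions based on the dataset described above, not the actual file from the lesson.

# Load the (hypothetical) social media table, aggregate it, and draw a chart.
# Requires: pip install pandas matplotlib
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("social_media_usage.csv")   # hypothetical file name

# Average daily minutes per app (column names assumed from the example)
usage = df.groupby("App")["Daily Minutes Spent"].mean().sort_values()

usage.plot(kind="barh", color="steelblue", title="Average daily minutes per app")
plt.xlabel("Minutes per day")
plt.tight_layout()
plt.savefig("usage_chart.png")   # or plt.show() for an interactive window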
You can also make the chart bigger or download it with the download button. ChatGPT also understands the context of the chat: "make a pic that illustrates the dataset," and it understands this is about social media, so we most likely get some people using their phones, and that is indeed what we get, a social media image with some data, because this is a dataset. By the way, this is called function calling. We don't have time to dive deep into it here; just think of it this way: whenever ChatGPT or another LLM is not capable enough on its own, it calls a different tool to do the job. Andrej Karpathy likes to describe the LLM as our new operating system, a computer that can use different tools. And as part of that tool use, don't forget they can also use the internet to search for live information; I showed you that in the last video. Also important before we talk about training our LLMs: they are multimodal, which means they can hear, speak, and see. Hearing and speaking you already saw in the last video; I just want to show you that they can also see. In ChatGPT you can upload pictures. This one is a diagram from Hugging Face about reinforcement learning, and yes, it looks complicated, and yes, the image quality I uploaded on purpose is awful. "What is in the pic? Explain it like I am five." Let's see if ChatGPT gets it. It does: "Start with a language model. Imagine the computer is like a child who already knows some words and sentences... then give it a reward, make it practice; these combined learning steps are reinforcement learning." That is this Hugging Face diagram, and even with the bad quality ChatGPT can see it and explain it like I'm five. So LLMs can also see, speak, and hear. You can also train LLMs: with prompts, the so-called prompt engineering, or with RAG technology or fine-tuning. I want to dive deeper into prompt engineering in the next video, because it is really important. In this video you learned that LLMs can do a lot: generate text, summarize text, create code and shrink code, and use many different tools to analyze data, create pictures, use a calculator, and do a lot of other cool stuff. Think about what matters most for you; you can do whole tasks with an LLM. For example: write a story about a company that is doing well, then make some calculations about how it might do in the future, then make some tables about how it is doing, and finally make a picture of a happy investor. That's an entire presentation, so ChatGPT and LLMs can really help you a lot. For the developers among you, a minimal sketch of the function-calling idea follows below.
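For coders who are curious what "function calling" looks like in practice, here is a minimal sketch using the OpenAI Python SDK. The multiply tool is an invented example (not from the course): the model decides whether to call it and with which arguments, and your own code runs the actual function.

# Minimal function-calling sketch with the OpenAI Python SDK (v1.x).
# Requires: pip install openai, and OPENAI_API_KEY set in the environment.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two numbers exactly.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number"},
                "b": {"type": "number"},
            },
            "required": ["a", "b"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 98 times 98?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:                       # the model chose to use the tool
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    print("Result:", args["a"] * args["b"])  # your code does the math, not the LLM
else:
    print(message.content)                   # the model answered directly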
6. Prompt Engineering: Let's talk about prompt engineering. This guide comes directly from OpenAI, the company behind ChatGPT, and yes, the OpenAI models are also included in Microsoft Copilot. Mine is showing German right now, but of course we can use Copilot in English too, and with a white background; the dark look is simply the theme I use, and later we will switch to the white background. Back to prompt engineering. Prompt engineering is important because if you don't give good inputs, you will not get good outputs. I'll show the techniques in Microsoft Copilot, but this works exactly the same in ChatGPT and every other model under the sun, because the concepts are always the same. You can read the OpenAI resource yourself if you like, but we want to do this as fast as possible; we don't have time for every single prompt engineering technique. Here is an example of a really, really bad prompt: "Give me an article about smartphones." Why is this prompt bad? Because we give no context. If we send it out, using the Balanced mode here, we will most likely get an answer, but the answer is not specific because our input is not specific. And boom, there is our output: an article from The Guardian, with a link we can click. This is a bad prompt, and we should expect bad output. Is it truly terrible? Not necessarily; it is just the output we asked for. We asked for an article and got an article that is not specific. Maybe you had something in mind for your blog, but you can't use this for it. The output is weak simply because we gave no context. The good news: it is really easy to give context, and to do it you only need to understand one key principle, called semantic association. What does semantic association mean? Suppose I tell you one word, or two, or ten. Say I tell you "Greek god." With those two words you immediately have a hundred other words in your brain, and maybe a hundred pictures too: different Greek gods, images of them, ancient Rome, a sculpted body, all sorts of things. That is basically the whole concept of prompt engineering: we need to give context and use semantic association, because all these large language models, including the Copilot that uses ChatGPT underneath, are associative. If we give an LLM just one or two words, it has all the other related words in the background, in its knowledge. If we say "smartphone," it has lots of words that are similar to smartphones, because it was trained on text and it draws on the texts where "smartphone" appears a lot. If we give it a few more words, everything gets more precise: words like "Apple" or "Android," or "blog article" if you want a blog article, and much more. The key concept is that with a few words you give the LLM a lot of context, because it is associative. Let's make an example. Press "new topic" and start from scratch with a balanced output, and give Copilot a prompt that makes a lot of sense, starting like this: "You are an expert for smartphones." Why do we do this? This is called role prompting: we give the large language model a role.
In this case Copilot, or ChatGPT, becomes an expert for smartphones, and then we give more context: "You know the Google Pixel 8 Pro in detail." Why does this matter? Because if we tell it that it is a smartphone expert who knows the Google Pixel 8 Pro in detail, it will draw on material where all of this appears together, so we get genuinely expert output about the Pixel 8 Pro. Then we tell the LLM exactly what we need: a 600-word article on why the Pixel 8 Pro is good, a positive article. The next part is also key, and it is the semantic association I talked about: I include just three terms, "Gemini Nano," "LLM," and "on-device." With these few words the LLM will draw on material where all of this is included, because for me this is one of the key features that makes the Pixel 8 Pro so good: Gemini Nano, a small large language model that runs on-device. We can also add something like "no latency" if we want. So if you are an expert in something, tell the LLM that it is an expert too: we tell it it's a smartphone expert, that it knows the Pixel 8 Pro, and then we give it the words we want included in the article, and it finds the right material for us. Send it out, and I am fairly sure we get much better output. You could also add something like "write the article for a 10-year-old" to make it really simple, because semantic association can do that too (the model leans on text that is really easy to understand), but I won't do that right now. I simply send it, and we get a good article we could even put on a website. And here it is; I hope you see that the output is completely different from before: "As a smartphone expert, I can tell you that the Google Pixel 8 Pro is an excellent device that offers a range of features and capabilities that make it stand out from the crowd. Here are some reasons..." and so on: design and build quality, the camera, the software, Gemini Nano and the LLM, the Pixel 8 Pro being powered by Google's Tensor G3 chip. Of course, you can be even more specific: "make this article for my website," or "make this article as a Twitter thread." "Make the article for a Twitter thread. Readers are students of tech, so include details." Now we get every detail, formatted for a Twitter thread: the software, how many megapixels and sensors the camera has, and much more. We can also go the other way and make it simpler: "Make the article for a 12-year-old." The harsher vocabulary gets dropped and you see it immediately: "One of the best things about the Google is the camera," and so on; simpler words, easier output. A small code sketch of this structured-prompt pattern follows below.
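Here is the same structured prompt assembled in a few lines of Python, just to make the three building blocks (role, task, semantic-association keywords) explicit. The exact wording is a sketch, not a fixed recipe; pasted into ChatGPT or Copilot it works the same way.

# Build a structured prompt: role + concrete task + association keywords.
role = "You are an expert for smartphones. You know the Google Pixel 8 Pro in detail."
task = "Write a positive article of roughly 600 words on why the Pixel 8 Pro is good."
keywords = ["Gemini Nano", "LLM", "on-device", "no latency"]   # semantic association triggers
audience = "Write it for the readers of my tech website."

prompt = f"{role}\n{task}\nInclude these points: {', '.join(keywords)}.\n{audience}"
print(prompt)   # paste this into ChatGPT/Copilot, or send it via an API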
That's basically all you need to understand to start writing your prompts immediately: make structured prompts. The prompt above is a structured prompt because it starts with a role; this is also called role prompting, and I'll give you a few more quick examples in the next video. Start with the role ("you are an expert in X, Y, and Z," plus maybe some details), then use the structured prompt to tell the LLM exactly what you need (an article roughly 600 words long about the Pixel 8 Pro and why it is good), and then trigger the semantic association with just a few words. You don't have to use exactly these words; it's just important to include some of them. So this video was about prompt engineering. LLMs are relatively simple to understand: boiled down to the key principle, they can do two things, make text bigger and make text smaller, and we need good prompts to get good outputs. We trigger the semantic association with structured prompts: give a role, say exactly what we want, and include a few words related to the things we like. Of course there are countless other prompting concepts, chain of thought, tree of thought, and much more; I have other courses that cover them in detail, but in this course I want you to become productive as fast and as efficiently as possible. In the next video I'll show you one or two more tricks that matter for prompt engineering, and then you are ready to rock. Just remember: give context in order to get good output. 7. More Prompt Engineering Tips: In this video I want to give you a few more tips and tricks for writing efficient prompts in ChatGPT or, in this example, Copilot. You already saw role prompting: give the LLM a role, "you are an expert in XYZ"; we covered that in the last video. This one is completely new: shot prompting. In shot prompting you simply give examples. What does that mean? You can say, for example, "You are a copywriting expert, and here is a copy I like," then paste the copy and tell the LLM to write a similar copy for X, Y, and Z. And these two phrases are really useful: "take a deep breath" and "think step by step" (you can even combine them). Why do they work? Because the LLM itself will then work step by step, which is better for you and for the LLM. A quick example: suppose you want to install Python but know nothing about it. If you simply type "how to install Python," the probability is relatively high that the answer starts at a point you don't understand yet. That is a problem for you, and sometimes for the model too. If the LLM is not sitting on the perfect training text, it always makes sense to tell it to think step by step, because then it starts with things like "first, open the Chrome web browser." That is the first step.
If you tell the LLM to think step by step, or to take a deep breath, it starts at the first step, which is most likely opening a web browser; after that you type "Python" into Google, and so on. Doing it this way you get better output, and the LLM itself can associate more, because it has generated new words of its own: it starts writing things like "Google Chrome" and "search for Python," and in that moment it has more useful context in its own context window. This is really practical, and it's a tip I can't stress enough: take a deep breath and think step by step. And by the way, I'm not making this up; there are studies showing that these phrases improve the output. Here comes a funny one: something like "I'll give you 20 bucks" also works really well. We offer ChatGPT or Copilot a nice little tip, some money, or at least the promise of it, and the output gets better. Don't ask me exactly why this works; I just know that it does, and there are studies suggesting the same. So understand this: by adding phrases like "take a deep breath," "think step by step," and "I'll give you 20 bucks," you get better output from Copilot. Write that down. Role prompting you already understand; for shot prompting, here is an example. New topic. Let's say I really want a copy for something. We start with "You are a copywriting expert. I like this copy." So we give the role, and then we paste in a copy that we like; the text I'm including is simply (a part of) the copy from my course "All of AI," a copy I really like because I wrote it myself. I'll shorten it a little just to show you the idea. And now a nice little trick: add "Answer only with OK." You can do this whenever you want to save some tokens. We send it off and get an "OK" back, and after the OK we tell the LLM the next step. The LLM now has the copy, or at least part of it, and remember, LLMs are associative, so it understands how the copy is structured. Now we tell it what we actually want: "Give me a similar copy, but for a course named Microsoft Copilot." This is important because I do it a lot just to get more ideas for my own copy; it is really practical. So: first you write a copy yourself, or find one on the internet; you give it as the example and tell the LLM to answer only with OK; you get your OK back; and then you ask for the real task, a similar copy for the course named Microsoft Copilot. And here is the similar copy: "Welcome to the Introduction to Microsoft Copilot course, your journey into the world of AI-powered code completion..."
If we scroll up, that starts very much like my original copy: "Welcome to All of AI: GPT, Midjourney, Stable Diffusion, and app development, your journey into the world of artificial intelligence. This master class is perfect for anyone..." and the new one mirrors it: "This course is perfect for anyone..." So we get a similar style but not exactly the same words, and that is the strongest feature of shot prompting: give examples and you get similar output, not identical output. If you use shot prompting, you don't need "take a deep breath," "think step by step," or the money trick, because you have a good example and the LLM can associate enough from it to understand what you need. Those phrases matter more when you don't use examples; with plain role prompting, it makes a lot of sense to add "take a deep breath," "think step by step," or "I'll give you 20 bucks" at the end of your text. The key concept is always: give context. And always remember that tokens are not unlimited; that is exactly why we used "answer only with OK" in the example, to save some tokens. You don't want to paste endless examples and endless filler. These LLMs are associative, and you get precise, short acknowledgements if you ask for just an "OK," and then you ask your next question. So in this video you learned a few good tricks: include "let's think step by step," "take a deep breath," and maybe offer some money, and you'll get better output. If you have the chance to give examples of things you like, absolutely do it; that is shot prompting. The key is always to trigger the semantic association, so give context, but keep in mind that your tokens are not unlimited, which is why the quick "OK" trick exists. Remember, everything counts against the token limit: what you put in and what the LLM spits out. Sooner or later the limit is reached and the LLM no longer understands what you were talking about. That's a lot of tips and tricks in one lesson, but I really recommend you try all of them out; a minimal code sketch of the shot-prompting pattern follows below.
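If you use the API rather than the chat interface, shot prompting is simply a message list with your example and the "OK" acknowledgement baked in. A minimal sketch of that pattern; the example copy is a placeholder and the model name is an assumption, not something prescribed by the course.

# Shot prompting as an API message list (OpenAI Python SDK, v1.x).
# Requires: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

example_copy = "Welcome to All of AI: GPT, Midjourney, Stable Diffusion ..."  # your own example

messages = [
    {"role": "system", "content": "You are a copywriting expert."},
    {"role": "user", "content": f"I like this copy:\n{example_copy}\nAnswer only with OK."},
    {"role": "assistant", "content": "OK."},   # the acknowledgement that saves output tokens
    {"role": "user", "content": "Give me a similar copy, but for a course named Microsoft Copilot."},
]

response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)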
8. Customizing LLMs with System Prompts and RAG (Retrieval Augmented Generation): Let's talk about training LLMs. We have two options: we can train them with prompts or with RAG technology. First I'll explain what RAG is, then we start with prompts, and then we use RAG. You already know ChatGPT (let's just call it GPT), and it can answer questions; when it is not capable enough on its own, it can go and use different tools, for example the internet, to search for things. But suppose you want to train a GPT on your own data, say data from your own business or your own marketing text. You have two options: do it with prompts, or do it with a vector database. We won't explain vector databases in depth, because you just want to learn to use this stuff quickly; basically, you upload a lot of context in a file, and ChatGPT browses that file and then has all that knowledge. I'll show you a couple of tricks, first with prompts and then with the vector database. The easiest way to customize ChatGPT is the system prompt. Click your profile, go to "Customize ChatGPT," and there is the system prompt. Fill it out: "What would you like ChatGPT to know about you to provide better responses?" OpenAI even helps you: where are you based, what do you do for work, what are your hobbies, what subjects can you talk about for hours, what are some of your goals? Type that in and ChatGPT gives you different, better outputs. For example: "I live in Italy but speak German. I am an AI educator. My interests are LLMs and diffusion. I like to talk about AI. My goal is to make a good course." The next field is even more important: "How would you like ChatGPT to respond?" How formal or casual should it be, how long or short should the responses be, how do you want to be addressed, should it have opinions or remain neutral? I write: "Remain neutral. Call me Arnie. Your answers are short and, if possible, bullet points." Press save, and now the model is tuned to our specific preferences and reacts a little differently. Quick test: "ChatGPT, can you give me some info about the election?", with web search on, because the election happened right as I am recording this. It searches and tells me that November 5 was the election; the answer is short and concise and comes with links, but it does not call me Arnie. Why? I'll show you. In a new chat, without the search (this does not work that well when web search is used), let's try something else: "Hey GPT, I want to market a course. Give me some examples how to do it." I'd guess it now says "Hey Arnie" and gives some bullet points like boosting on social media, and it does: "Hey Arnie, all right, let's dive into some powerful marketing": use engaging social media previews, run a free webinar, leverage email marketing, create a lead magnet, collaborate with influencers, and so on. Short, concise, and ChatGPT calls me Arnie. That is the system prompt, and with it you can customize ChatGPT. You can also use shot prompting, but I've already shown you how that works: just give an example. Now I want to show you how the RAG technology works, because it is the most powerful tool for training an LLM. In ChatGPT (I think this is a paid feature at the moment) you can press "Explore GPTs" and search the GPTs, which you already know, but you can also press "Create a GPT," or go to "My GPTs" if you already have some. Let me show you one GPT, this diffusion prompt GPT, which is specifically trained to write prompts for diffusion models. Diffusion models make pictures.
If I type "cat," I get a prompt for a cat, specifically tailored for Midjourney, including camera lenses and so on; a perfect prompt that I can use to make good pictures in a diffusion model. Now I want to show you how to train one of these yourself. Go back to Explore GPTs, then My GPTs, open the diffusion prompt GPT and press "Edit GPT": you can set a name, a description, the instructions (how the GPT should behave), and finally you can upload documents with examples. Let's do it from scratch. Say we are a company and we want a GPT that handles the onboarding for us. Press Create, then go to Configure. Name it "Onboarding," description "Onboard new members"; I'll keep this really simple. Instructions: "You are the CEO of the company AI With Arnie. Your goal is to onboard people. If they have questions, you search your knowledge and give them info." That is a really simple system prompt. We can add some conversation starters if we want; all the people who start at my company ask me the same two questions, "Where is the toilet?" and "When is lunch?", so those go in; you can think up your own. Then the knowledge: here we can upload files. This could be a PDF, a text file, anything; I'll just create a simple text file right now with some info (it could also be a 50-page PDF). This is what people need to know: "The toilet is not here. We do not need to pee at our company. We have lunch when work is done. We work seven days a week. We do not have holidays. If you want more info, go here," and here we can also add a link; I'll use my free community, but that one is in German. Save the file, come back into the GPT builder, and upload it under Knowledge. We could enable other tools, but this GPT doesn't need web search or DALL-E image generation; you could include data analysis, but I don't think that is really necessary either. If you are a programmer, you can also create actions, where you put in a schema and URLs so the GPT can call other APIs, but that is not the point of this fast little course. Press Create, share it with "anyone with a link," and press Save. That link is what we share with the people who work at our company. Press "View GPT" and ask: "So where is the toilet?" ChatGPT will most likely answer that the company does not have one, and you can see it: "It appears that our company does not have designated toilets," because I stated "the toilet is not here, we do not need to pee at our company," and if you want more info you can press this link. Under the hood, this knowledge upload is the RAG idea; a very small sketch of it follows below.
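What the knowledge upload does behind the scenes is retrieval augmented generation: chunk the document, embed the chunks, find the chunk closest to the question, and paste it into the prompt. A very small sketch of that idea with the OpenAI SDK (my own illustration; a real setup would use a proper vector database instead of the plain list used here, and the model names are assumptions).

# Tiny RAG sketch: embed chunks, retrieve the closest one, answer from it.
# Requires: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

chunks = [
    "The toilet is not here. We do not need to pee at our company.",
    "We have lunch when work is done.",
    "We work seven days a week. We do not have holidays.",
]

def embed(texts):
    result = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in result.data]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

chunk_vectors = embed(chunks)
question = "When do we have holidays?"
question_vector = embed([question])[0]

# Pick the chunk whose embedding is closest to the question
best_chunk = max(zip(chunks, chunk_vectors), key=lambda pair: cosine(question_vector, pair[1]))[0]

answer = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You onboard new employees. Answer only from the provided context."},
        {"role": "user", "content": f"Context: {best_chunk}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)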
9. Perplexity and Huggingchat: If you want to explore more tools where you can use LLMs, you can take a closer look at HuggingChat. HuggingChat is really easy to use. Here you can pick which open-source LLM you want to use — for example Llama 3.1, the 70B model, a Qwen model, some models from NVIDIA or from Microsoft. Just click on the model you want, type in a system prompt if you like, and press New Chat. You also have tools here, so these models can use different tools just like ChatGPT: a diffusion model to generate pictures, image editors, a URL fetcher, a document parser, a calculator, and a web search. So this is basically something like an open-source ChatGPT, forever for free. And then we have Perplexity. Perplexity is similar to ChatGPT search. You can play with it a little; I no longer use it a lot, because ChatGPT is now also relatively good with its own search tool, but you can try Perplexity if you want. You can start for free and you do not even have to make an account — just try it, see what you like, and maybe you stick with something. 10. Developers Can Use LLMs via OpenAI API: If you are a developer, you can also include ChatGPT in your own apps. You can use it in the OpenAI Playground. This is maybe also interesting for you if you want to use the newest models but do not want to pay 20 bucks a month: on the playground you simply pay as you go, per token. I want to show you how much you need to pay, how it works, and how you can make API calls to ChatGPT. First, you go to the platform: platform.openai.com/playground. Here you can play with all their models. Under Chat, you can play with the chat models and use the newest ones — GPT-4o mini, GPT-4o and so on — select whatever you want. You can also import functions, so yes, you can do function calling if you are a coder; I just want to keep this quick, please excuse me. Then the response format: right now this is text, but you can also use JSON format and so on. Here we have temperature and maximum length; you can read this for yourself. Basically, if you decrease the temperature, the model will be more accurate but can get a little repetitive — especially for math tasks this is good. And the maximum length simply controls the output: how long the answer that ChatGPT gives you can be. These are the most important settings right here.
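If you would rather set these playground options in code, the same call looks roughly like this with the official Python client — a minimal sketch; the model name, temperature and token limit are just example values, and the system message is the "system instructions" field we look at next.

# The playground settings as a sketch in Python.
# Assumes pip install openai and an OPENAI_API_KEY; the model name is an example.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    temperature=0.2,   # lower = more accurate but a bit repetitive, good for math
    max_tokens=300,    # caps how long the answer can be
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a story about a turtle in the desert."},
    ],
)
print(response.choices[0].message.content)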
Then, in the middle, you have the system instructions — this is basically the system prompt, just like the custom instructions I showed you in the last video. "You are a helpful assistant", for example. Below that, you type in your text as usual: "Tell me a story about a turtle in the desert." You press Run, ChatGPT answers, and you can always use the newest models without a limit, paying as you go. I want to show you how much this costs. If we go to the pricing section, you see that GPT-4o, for example, costs $2.50 per 1 million input tokens and $10 per 1 million output tokens. Every model has its own pricing. If you scroll down, you can also call the other models: GPT-4o mini is really, really cheap; o1-preview gets a little more expensive; the Realtime API is really expensive — it can go up to $200 per 1 million output tokens. That is when ChatGPT talks to you, so in audio format. You can also generate pictures with DALL-E if you call those endpoints, at $0.04 per image. Back in the playground, in the left corner you have Realtime, and here you can also talk to these models: "Give me a small little joke, I want to laugh." — "Sure. Here's a little joke for you. Why can't you give Elsa a balloon? Because she'll let it go." That's basically it, and for this realtime output we pay those audio prices. Then we have the Assistants. Assistants are basically the same thing as those GPTs, so we can include RAG and all of that, and we can build our own applications with them. If we go into text to speech, you can type in text and get speech back: "Hey ChatGPT, I like you" — generate it, and there you hear the Alloy voice reading out what we typed. And then we also have the completions mode here. If you want to use all of this, you need to go to your account: press on your profile, go to Billing, and add your credit card under payment methods. Then you give your account a little bit of balance, and everything will work. Of course you can also set limits: under Limits I currently have 500 bucks a month as a cap. Under Usage you can always see how much it costs you per day. This was a day where I paid five bucks, because I also run some chatbots and one of them talked a lot. For October the usage is around 28 bucks — these are chatbots that I have embedded in some websites, and people use them, so I pay a little. If you just play with this a little bit, you will only pay a few cents; you can see here that with $0.13 you can already play around with these models. Back on the dashboard you can do a lot more: you can go to fine-tuning and fine-tune your own model if you like — not really the point of this course. But if you go to API keys, you can also make calls to the API.
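Before you create a key, it helps to get a feel for what those per-token prices mean in practice. A quick back-of-the-envelope calculation in Python, using the GPT-4o prices quoted above (the token counts are made-up example numbers):

# Back-of-the-envelope cost check with the GPT-4o prices quoted above
# ($2.50 per 1M input tokens, $10 per 1M output tokens).
input_tokens = 2_000     # roughly a long prompt plus some context (assumption)
output_tokens = 800      # a fairly detailed answer (assumption)

cost = input_tokens / 1_000_000 * 2.50 + output_tokens / 1_000_000 * 10.00
print(f"~${cost:.4f} per request")   # about $0.013 here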
To make calls, you simply create a new secret key, give it a name, and then you can copy your API key and use it in your own applications. If you are a developer, just go to the documentation and the quickstart; they tell you exactly what to do. You create an API key, then you can call the endpoints — in Python, the first step is pip install openai. They show you, for example, how to generate text in your own application with a few lines of code; if you want to generate an image, you call DALL-E in a similar way; and if you want to create vector embeddings, you call the embeddings endpoint. It is really easy with this quickstart. So if you are a developer, the OpenAI API is simple to use, and you can call it with JavaScript, Python, or cURL. If you are not a developer, this platform is most likely not for you, but generally speaking it's relatively easy. I like, for example, Flowise, and I use the OpenAI API there to make AI agents. Like I said, this is not a complete deep dive. If you just want to learn this as quickly as possible, this platform may be an option for you if you do not want to pay the 20 bucks a month for the ChatGPT Plus interface, because here you can work with the newest models and you only pay for the tokens you generate — and tokens are relatively cheap. So play around with this platform a little and see if it's for you. And of course, all the other LLMs have their own APIs as well: Google has an API for the Gemini models, Anthropic has an API for the Claude models, and if you want to work with an open-source LLM you can use, for example, the Groq API, or run your own server with LM Studio or Ollama. You have endless options: you can make your own endpoints if you run models locally on your PC, or you can use different API calls. Like I said, this is more of a general guide for developers; if you don't want to develop with these things, skip this video.
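If you do want to try it, here is roughly what the quickstart's other two calls look like with the official Python client — one image generation and one embedding — as a sketch; the model names, prompt and size are just examples.

# A rough sketch of the other two endpoints mentioned in the quickstart:
# image generation and embeddings. Model names and the prompt are examples.
from openai import OpenAI

client = OpenAI()

# Generate an image with DALL-E and print its URL.
image = client.images.generate(
    model="dall-e-3",
    prompt="An illustration of a cat relaxing in a city at golden hour",
    size="1024x1024",
)
print(image.data[0].url)

# Create a vector embedding (useful for search or RAG, as shown earlier).
embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input="The toilet is not here.",
)
print(len(embedding.data[0].embedding))  # dimensionality of the vector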
11. Recap of LLMs: In this section you have learned a lot, and we did it as quickly as possible. We started with all the interfaces of these different LLMs, and you know there are a lot: ChatGPT, Claude, Gemini; you can also use Ollama, you can use Groq, you can use lots of different interfaces, even HuggingChat, and much more. All of them work relatively similarly — you always have a nice little chat interface. LLMs can basically do only two things: they can expand text or they can compress text. But that is big: it works for code, for normal text, for tables, and LLMs can also call tools. Tools can be, for example, a code interpreter, a diffusion model, the Internet — you can analyze data, make charts, and do a lot of cool stuff. Maybe in the future they become a complete new operating system, and by the way, LLMs can also talk to each other; then we call them agents. You also learned that LLMs are multimodal: they can basically see, speak, and hear. You only get good output if you give good input, and I showed you the basics of prompt engineering. Please remember semantic association: you need to give context. You can do this via shot prompting or role prompting, you should structure your prompts, and there are tips like "think step by step". Besides that, we also have chain of thought, tree of thought, reverse prompt engineering, and much more — but I think for most people this is overkill; it's not really needed. If you want to customize an LLM, you can totally do that. The easiest way is probably the system prompt: you simply give some instructions. Then we have RAG technology: we upload data, and ChatGPT or any other LLM can browse this data and react in a specific way. And if you are a developer, you can do all of this over the API too: you can develop your own apps, do function calling, build complete agents with tools like Flowise, create pictures inside your own applications, use vision — all of it. You have learned the basics of these LLMs; they can do a lot, and I think you should start. Simply use them, because remember: you only learn if you change your behavior. Learning means same circumstances but different behavior. Maybe you did not know how to use LLMs before; now you do — but you only learn it if you actually do it. And if you want to be a smart cookie, you can share this course, because more people always know more than a few people, so everybody can learn together. Thank you for that, and I'll see you in the next video — that was it for LLMs, now we start to create pictures with diffusion models. 12. The Diffusion Model Explained: This section is about diffusion models, and there are a lot of them out there. We have DALL-E, we have Imagen, we have Stable Diffusion, we have Sora (Sora makes videos), we have Midjourney — and diffusion models can also make music and, of course, audio. In this video I want to show you the diffusion process, and then we will dive deeper into some of the best diffusion models. So first, how diffusion models work, really easy and fast. I found a really nice article on Medium; all I need is this picture right here. Let's assume we have a big, big computer and we train it on images like this. We give the computer an image, for example of this beach, and we describe it with text: a beach with a blue ocean, blue sky, some green on the mountains, and so on — we are really specific. After that, we add some noise to the picture, like you see here, but we still describe what is on it: a beach, blue ocean, blue sky, and so on. More noise, same text; more noise, same text; more noise, same text — until there is only noise left. In this process, the computer learns what these pictures look like; it simply learns that the words we gave it lead to this picture. And so we can reverse it: if we have only noise and we tell the computer "a beach, blue sky, blue ocean, some green on the mountains", the computer can reverse the process and turn the noise back into this picture. Of course, we don't do this with just one picture — we try to give the computer every picture we can find. And there are different diffusion models. For example, there is also Adobe Firefly; Adobe Firefly is trained on pictures from Adobe Stock. Stable Diffusion is open source and free — everybody can use it — and it was trained on pictures from the Internet. Because of this, we can create nearly everything that is on the Internet. We can even create celebrities.
We can create NSFW stuff, and so on — Stable Diffusion is not restricted. Nearly everything that is on the Internet, we can create with Stable Diffusion if we give the right prompts. The prompts are the descriptions we give the computer to make our picture, and for that reason it is really important to write good prompts, because we want good pictures. If we are not specific, we get random pictures: if we just say "a beach", we get a random beach; if we say "a beach, blue ocean, blue sky" and so on, we get exactly this picture. A quick illustration of this process, because some people like this one — I use it a lot. Imagine you lie down on the ground and look at the sky. Beside you is your girlfriend or boyfriend or whoever you want, and she says, "Can you see this cloud? It looks a little like an apple." But you don't get it; you don't see the apple. Then she says, "Of course, just look, here is the apple" — and you start to see it. Your eyes see an apple because your brain is trained on apples: your brain knows what apples look like, so you see the apple in the cloud even though there is no apple there. And if your girlfriend doesn't say it's a green apple, maybe you think of a red one — and that is exactly why we need good prompt engineering. If we aren't specific, we get random pictures. If you want a green apple, you need to tell the computer you want a green apple, just like your girlfriend needs to tell you the apple in the clouds is green. If she doesn't, maybe you think of a red apple, maybe a green one, maybe even a yellow one — you don't know, so you need to be specific. So in this video we took a quick look at the diffusion model. It works simply: it is trained on pictures and on text, then noise gets added, and in this process the computer learns what the picture looks like. If we give the computer text afterwards, it can create these pictures, because it removes the noise step by step until the pixels match our description. I hope this makes sense for you.
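If you like seeing ideas in code, here is a toy sketch of just the forward half of that process — blending more and more noise into an image array — in plain Python with numpy. It is only an illustration of the idea, not a real diffusion model; the array size and noise levels are arbitrary.

# A toy illustration of the forward (noising) half of diffusion: we take an
# "image" and blend in more and more random noise. Real models learn to undo
# exactly this, step by step. Pure numpy, no actual diffusion model involved.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))          # stand-in for a training picture

for noise_level in [0.0, 0.25, 0.5, 0.75, 1.0]:
    noise = rng.normal(0.0, 1.0, image.shape)
    noisy = (1 - noise_level) * image + noise_level * noise
    # The caption ("a beach, blue ocean, blue sky, ...") stays the same at
    # every noise level - that pairing is what the model is trained on.
    print(f"noise level {noise_level:.2f} -> pixel std {noisy.std():.2f}")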
13. Prompt Engineering for Diffusion Models: Starting with DALL E: In this video we start to use our first diffusion model, and we want to start with DALL-E because DALL-E is the easiest to use. DALL-E works inside of ChatGPT, so we already know the interface, and the prompts are really easy to write because ChatGPT helps you — the LLM will help you create better prompts. The first thing you can do is simply go into ChatGPT. You can work with the normal multimodal GPT-4o, or you can explore GPTs and search for DALL-E; under "By ChatGPT" you can press on DALL-E and start the chat. Here you can create your pictures: you can add things to your prompt, and you can also use different aspect ratios. Let's use widescreen, and I just want to start with a really simple prompt: I type in "cat", leave the wide aspect ratio, send it out, and we get our first picture back. And there we have our first two pictures. If you press on a picture, you can see exactly what prompt led to this result: "A beautifully detailed widescreen image of a cat sitting by a window with soft sunlight", and so on. So you see, the prompt is really detailed, and I want to show you how to write prompts for this kind of diffusion model. In DALL-E it's easy because ChatGPT writes these beautiful prompts for you, so it's really no magic to create good pictures — DALL-E is not the best diffusion model, but it is the easiest to use. If you want to write good prompts on your own, you should look at this checklist: you need to include subject, medium, environment, lighting, color, mood, and composition. What does all of this mean? The subject: you can make pictures of persons, animals, characters, locations, objects, and so on. The medium could be a photo, an illustration, or something else. The environment could be outdoors, on the moon, or somewhere else. The lighting could be studio lights, neon lights, or something else. The colors can be vibrant, colorful, black and white, and so on. The mood: the cat could be, for example, calm or peaceful. And the composition could be, for example, a full body view. Make sure to include these things — you don't have to, but if you don't, the pictures will be more random: you might get a photo or an illustration; if you don't specify it, anything can happen. There are also bigger prompting guides, where you can include things like subject, actions, environment, color, style, mood, lighting, perspective or viewpoint, textures, time period, cultural elements, emotions, medium, clothing, text, and so on. That is a gigantic prompting guide; I'll just leave it with you so you can read it yourself. If you want to do it fast, just think about the checklist, because those things matter most. An example that could work: "An illustration of a cat relaxing in a city, in vibrant colors, full body view, at golden hour, with a 16:9 aspect ratio." If we copy this and throw it into DALL-E, we get a specific output — and even here, ChatGPT will help you create an even better prompt. But this is a prompt that works in every single diffusion model: the prompting techniques always work the same. And you see, now we have a really specific picture, exactly the picture we wanted. If you click on it and look at the prompt, you see that ChatGPT made the prompt even better. You can improve prompts further with some magic words, for example "cinematic", "film grain", "ultra realistic", "dramatic lighting". You can use different shots and camera lenses — point of view, drone shot, and so on. You can use cameras with a cinematic look, different filmmakers, genres, keywords for movement like "action scene", different photographers (for example sports photographers), cameras for action scenes like the Canon EOS-1D X Mark II, all the different lighting options — bright lights, warm, cold, low-key lighting — the golden hour, and all these different emotions. Make sure to include what you want to see; this is the most important thing, because all of these diffusion models are trained on pictures with detailed descriptions, and if you give a detailed description, you also get back what you want. If you just type in "cat", the cat could be anything.
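If you want that checklist in a reusable form, here is a tiny helper that assembles the parts into one prompt string — just a convenience sketch; the default values are my own examples and you can swap any of them.

# A small helper that turns the checklist (subject, medium, environment,
# lighting, color, mood, composition) into a single prompt string.
def build_prompt(subject, medium="photo", environment="outdoors",
                 lighting="golden hour", color="vibrant colors",
                 mood="calm", composition="full body view",
                 aspect_ratio="16:9"):
    # Order follows the checklist; skip or swap any part you like.
    return (f"{medium} of {subject}, {environment}, {lighting} lighting, "
            f"{color}, {mood} mood, {composition}, {aspect_ratio} aspect ratio")

print(build_prompt("a cat relaxing in a city", medium="illustration"))
# illustration of a cat relaxing in a city, outdoors, golden hour lighting,
# vibrant colors, calm mood, full body view, 16:9 aspect ratio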
Now I want to show you the Diffusion Prompts GPT once again — I hope you remember how we built such a GPT. It helps with prompt engineering: if we type in "steak", we get a detailed prompt for a steak, and you already know how this works. If I copy this prompt, I can throw it into the DALL-E interface, and I will get a cool picture back. So let's throw it in. The aspect ratio is now 1:1 — that's the default setting — and this prompt will work really well because we trained the GPT for it. You already know how to set up such a GPT, and now I want to show you the training data. But first, let's look at the steak: it looks really good, because we also include cameras, camera lenses and so on. Inside the Diffusion Prompts GPT, I simply tell it in the instructions that it needs to write good prompts, and then I upload a document that describes exactly how the LLM should structure those prompts. My training data looks something like this — the prompt structure: a [medium] of [subject] with [characteristics], [relation to the background], then the background, the details of the background, interactions with color and lighting, taken or drawn with [specific traits of style]. I give some descriptions, then some examples that I like, and lastly I include all the nice little keywords that make pictures better. You can just use my GPT if you don't have time to train your own — I will link it for you — so you can write really good prompts really fast. So in this video you learned how to use any diffusion model. It's important to write good prompts, and a good prompt should be specific about theme, medium, setting, lighting, color, mood, composition, and optionally the aspect ratio. If you don't want to write these prompts yourself, you can use DALL-E and ChatGPT will help you automatically; and if you want really good prompts for every other diffusion model too, you can simply use my GPT and get better outputs. In the next video I want to show you the basics of Midjourney — DALL-E is the easiest to use, but Midjourney can do a lot more. And I strongly recommend you make your first picture in DALL-E right now, because you learn most by doing. 14. Midjourney Basics: In this video I want to talk about Midjourney. In my mind, Midjourney is one of the best diffusion models, especially if you want to make realistic pictures. The first thing you need to do is go to their webpage. Right now, at this minute, you can try it out completely for free — I think you can make roughly 30 pictures for free on their webpage. Go to midjourney.com and create your account; you can simply log in with Google. As soon as you have created your 30 pictures, you will most likely need to upgrade your plan — it costs, I think, around nine bucks a month. On Explore you can see what other people are making, and the pictures look really good. You can also go to the search and look, for example, for dogs, and you'll find pictures of dogs. You can filter by Hot, Top Daily, and Likes, and simply find what you like. If you want to create something, go over to Create. Here are the pictures you have already created — most likely you have none — and if you want new pictures, you type your prompt right here: you simply type in what you want to see. I just want to run with this prompt here.
"Christmas deer head with pink bow and Christmas wreath, pastel watercolor on white background, in the style of..." and so on. The next thing you can do is press here, where you have some settings. You can set the aspect ratio you like — let's say 1:1, or 16:9 because we can see it a little better in a course. Then you have the mode: standard or raw; the raw mode is better for realistic stuff. You can use different versions — normally we always use the newest one, so 6.1 at this minute. Then there is Personalize: if you have already created a lot of pictures, Midjourney can adapt to your style. Then you have Stylization — if you don't know what something means, just hover over it with the mouse — Midjourney can add its own specific style, and if you increase it, you get more of that style. Weirdness makes the results more unexpected, and Variety controls how much the four pictures in your grid differ from each other. Then you have Fast and Turbo — just leave it at Fast — and then we create our first picture, so we send this out. While this is generating, I want to show you the seed, because the seed is always the starting point of every single picture. If we press here and type in --seed, we can use a random seed, for example this one here, and now we will get different pictures: they will not be exactly the same as the first ones. But if I do it once again and use the same seed again, we recreate exactly the same picture. Let me show you quickly, because the seed is important if you want character consistency. If we go down here, these are the first four pictures — these Christmas deer are nice. Here are the second four, and you see they are not exactly the same as the first ones: they are similar, but not the same. But if we go up here, you see we have exactly the same pictures as before, because we used the same seed. So if you want character consistency, you can work with seeds, and then maybe tweak the prompt just a tiny bit, and you always get really similar styles. Remember: the seed matters. That is the first thing you can do. And if you don't like one of these pictures, you can also edit them. If you press on a picture, you see you have a lot of options. You can make subtle or strong variations — press it and it runs automatically. You can do an upscale, either a subtle or a creative one, and the resolution gets bigger; so let's press Upscale. You can also remix — and again, if you don't understand something, just hover over it. If you press Subtle or Strong there, you can tweak your prompt and make it a little different, but right now I don't want to do that. Next there are Pan, Zoom, and a few more options — but before I show you those, I want to show you the upscale.
So you see these are just small variations. And here, this right now is the upscaling. So we made a small picture in bigger resolution. If you press on this or if you would download it, this simply has the higher resolution if you zoom in a lot. So you see the resolution here is really, really good. Compared to the first one, it's a lot better, so you see it is more clear. So it makes simply the resolution a little bit bigger. Then we have pan and Zoom. I do not like this anymore because right now we have on more the editor. And if you press on this editor, you can edit this picture. And here you can do the same thing as with the Pan and Zoom. You can simply do this right here, for example, and then you press submit, and now Mick Cherney will do the out painting and paints also here new pixels in it. But you can also do more. You can edit also with the inpainting. Let's just say that you do not like this right here. You can simply delete it and then make your prompt a little bit different. So we do not want to have the pink pow. So we press submit, and then we will get an in painting without the pink pow. Let's just go on create and then you can see what happens. So here are the first four generations, so you see we have simply generated a few new pixels. This was also not perfect, but yeah, come on. At least the picture got bigger. By the way, I think I like this one. That's not that great. Yeah, they are okay. And here are the next ones without the pink pal. So this is how you can edit your pictures. If you go on organized, you have a lot of different folders that you can make just to make it a little bit clearer. If you go on personalized, like I said, you can like different pictures, and then you can adapt your specific style. If you go on edit, I think not everybody has this right now. I think you need to be a long time on this web page in order to get it. Maybe as soon as you see the course, you also have this. You can simply upload an image from your computer and you can do the in painting completely the same. So just press on this, and now I just want to upload this picture right here, and let's just say I want to have a green hat. If I delete this, I can type in in the prompt, what I want to see guy with green then we send it out and we will get the green head most likely. We will also create right here the background, at least how I see it because this picture had not a background. So you can edit your own pictures really, really fast. And there we come on, this is a mess. But maybe the next one is better. Yeah, this is a lot better. Also, this works. Yeah, come on. These things are cool. The first one is a little bit of a mess, but the second, the third, and the fourth one, they are relatively okay. So you can also edit your own pictures, and also here, you can do the out painting. Let's just say you want to have different resolution. You can simply press Submit, edit, and then you will get your new picture, and you recreate the pixels down here. And, boom, there we have it four completely new pictures. Some of them are good, some of them are not really that great. And by the way, if you do not like a picture that much, of course, you can simply go in and edit it with the inpainting. So let's just say this was not perfect, and maybe also this was not perfect, you can edit it. I think you get what I mean. Next thing that you can do as soon as you have created such a picture or as soon as you have edited it with the arrays or with whatever, is that you can also do re texture. 
If you press on Retexture here — so this is no longer the Edit, but the Retexture — you can change the picture a little and make similar pictures. This works similarly to Stable Diffusion; Stable Diffusion calls this ControlNets. Midjourney also tells you what happens: "Retexture will change the contents of the input image while trying to preserve the original structure. For good results, avoid using prompts that are incompatible with the general structure of the image." So what we could do right now is type in "guy with green hat" — or just "guy with hat" — and also add "cyberpunk". Then we press Submit Retexture, and we will get something that looks somehow similar: a similar pose, a similar composition, but in a cyberpunk style. I hope you can see how this works — it's a really cool feature. Until now this was only possible in Stable Diffusion with the so-called ControlNets, and now we can do it in Midjourney too. So remember: with Edit you can edit all your pictures, and with Retexture you can retexture them — you can use what Stable Diffusion calls a ControlNet inside Midjourney. You don't have as much control here, but it's a nice feature. And that is basically everything you need to know inside Midjourney if you want to create fast. Yes, the tool is a lot bigger, but if you just want to start as quickly as possible, this is all you need: you can create pictures, you can edit pictures, and you can use seeds to recreate the same style over and over again. Have fun in Midjourney — like I said, as fast as possible. 15. Ideogram and Adobe Firefly: In this video I want to give you an overview of two more diffusion models: Ideogram and Adobe Firefly. These are two completely separate diffusion models. Adobe Firefly comes from Adobe and is also integrated into Photoshop and so on. I think Adobe is special here because Firefly is trained only on pictures from Adobe Stock, so you don't have to worry about copyright. That matters because Midjourney and the others can create pictures of people or of brands, and sometimes you can get copyright claims; with Adobe Firefly that is not the case. Ideogram, on the other hand, is special because it is really good with text. As soon as you go on one of these webpages — this right here is Ideogram, and I am on the free plan; no, I do not pay for every single model under the sun — you get a really clean interface. You have Home, and here you type in what you want to see; the prompt engineering always works the same. You have All, Realistic, Design, 3D and Anime, and you can simply browse for what you like. If you use Ideogram, I strongly recommend creating pictures like these — pictures that include text — because that is where Ideogram is really good. Let's make a test: "A fox that holds a sign with the letters: catch me if you can." Then we can make some adjustments: Magic Prompt — do we want it on, auto, or off? If you leave it on, your prompt gets automatically enhanced. Then the aspect ratio; the visibility (you can only go private if you pay); then the model and a color palette if you want. But for now I just send it out. There we have our four pictures.
If I press on them — yes, this took a little time, because generation is slow if you don't have a paid plan — you see the text is really good: "Catch me if you can." The text is perfect, and the fox is pretty good too. Let's look at the next one — where is it — this one here: "catch me if you can", the fox is really nice, so I really like this picture. This one is also relatively good, but the sign is floating around a bit, so I like the other one a little more. And the last one, "catch me if you can", is also really good. So basically, just go into this tool and play a little for yourself, especially if you want to render text — it's really great for that. Here is something else I like: logos and so on come out completely perfect. So play with it a little. Under Creations you can see what you have made — these are some pictures I created — and under Canvas you can also edit your stuff, similar to Midjourney. That is basically everything you need to know about Ideogram; it is really easy to use. The next thing is Adobe Firefly. Firefly works similarly: you have generative fill, text to image, generative expand, and video generation — videos don't work at this moment, you need to join the waitlist — but you can absolutely create and edit with Firefly. If you press here, you are on the Firefly webpage, and if you go back you see everything you can do: text to image, generative fill, generate a template, generate a vector (if you use Adobe Illustrator, you can also generate vectors), generative recolor, and text effects. You can play with all of these. The interface is really easy. If you press on text to image, you can simply try it out, and you can also reuse the pictures other people have made: say you like this one — press on it and the prompt gets copied automatically. Down here you can try that prompt, and on the left side you choose what you want. Let's use Firefly 3, the fast mode, 4:3 for example. Then the content type: art or photo? Let's say art. Then the composition: you can upload reference pictures, and you can also upload style references. Let's say you want this reference picture — well, for this prompt it's really not a good fit, so that wouldn't work great; I put its strength down to zero. Instead I want a style reference: let's say I want a bit more neon, so I include that. Then you can add other popular effects, for example the hyperrealistic effect; then color and tone, let's say warm; then the lighting, studio lights; the camera angle, let's say wide angle; and then you press "Try prompt". Yes, this prompt is now a complete mess, but I hope you get what I mean: these settings are really easy to use, and we still get impressive pictures. Come on, I really like this tiger, so you can absolutely play around with these things. If you like your picture, you can of course download it. And the next thing is, of course, that you can also edit your pictures.
You can either edit these pictures here by pressing Edit, or you can edit your own pictures. If we go back and press on Generative Fill, you can upload your own pictures or edit the ones already included. Say you want to edit this picture: press on it, and you can edit it however you want — insert, remove, or expand. If you press Expand, you can make the picture bigger: press Generate, and Firefly does the outpainting and adds something here; then you see what works for you. Let's say I want this one, and I press Keep. Next, remove: say I don't want this funny thing here, because I have no clue what it is. I simply remove it, and it should go away — and bam, there it is. I'll keep that, because I think it's nice. The next thing is insert. Let's insert something here — say a tiger. We type "tiger", press Generate, and we can insert different things. This also works if you want to edit people, for example: you can change clothes, change hair colors, change whatever you want. Yes, this tiger is a mess — come on, let's keep it anyway. I want to show you one more thing with a human. Say I want to edit this picture here: I use insert, and I want to see this person wearing different clothing. I simply select the clothes right here and type in what I want to see — let's say "jacket". And there we have it; I think it turned out somehow okay. Let's keep the first one — none of these are completely perfect. Adobe Firefly is a tool I don't use a lot, but some people really like it. It is especially powerful if you already work with Adobe Photoshop, because it is built in there. If you work with Illustrator, Photoshop and so on, you should totally use Adobe Firefly. So, that was Ideogram and Firefly: use Ideogram if you want to generate text inside pictures, and use Adobe Firefly if you already use the Adobe products — Illustrator and Photoshop — or if you want to be 100% certain you never infringe copyright, because Firefly is trained on Adobe Stock. Try these two tools out, and of course the prompt engineering is always the same. See you in the next video. 16. Open Source Models: Let's talk about open-source diffusion models. Mainly that means Stable Diffusion and Flux, but there are also other models like Recraft and OmniGen and many more. This topic is gigantic, and you have the most flexibility here: you can download these models and run them locally on your own machine, or you can run them in the cloud. The easiest and fastest way is to run them in the cloud, but I also want to show you some free options so you don't have to pay for every single feature under the sun. The first option would be ComfyUI. Since you don't have a lot of time in this course, it's maybe not the best option — the learning curve is really steep. This is ComfyUI; I have a course that covers it in detail, but ComfyUI is not the thing that works really fast. The second option is, for example, WebUI Forge. It runs relatively easily and fast, but here too you have to download a lot of stuff, so it's also not ideal.
With Forge you can also run Stable Diffusion, Flux, and much more. What I want to show you right now is Fooocus, because with Fooocus you can run Stable Diffusion — which is open source — for free, either in a Colab notebook or installed locally. If you want to install it locally, you can do that via this link right here. But the fastest way is this Colab notebook: open it in Colab, run it by simply pressing Play, and you get a Gradio link with a nice interface where you can run Stable Diffusion. I want to show you how this works, then Leonardo, and then Flux — we'll do this fast. After a while you get the line "running on public URL", and we press that link. A Gradio interface opens up, and here you have a lot of options. First, you can press Advanced, and you get a lot of settings; if you want to start fast, just leave the initial settings and use Speed. Number of images: let's say one. And here we have the special sauce of Stable Diffusion: we also get a negative prompt, where you type what you do not want to see — for example "ugly" and "blurry", or also colors, let's say "red", if we don't want red in our picture. Then we type in what we do want to see, let's say "Instagram model", and press Generate to create our first picture: an Instagram model, and it won't be an ugly picture — the "ugly" refers to the picture quality here, not to the model we create — it won't be blurry, and red will most likely not show up. And there we have it: normal brown hair, a nice picture, and the generation speed is also okay. Come on — we're using a free Colab notebook, we can use this forever for free, and I think that's cool. The quality is really good. Next, you can press on Styles and pick the styles you want, for example the SAI 3D Model style. If you select it and type in "cat", you get a cat that looks something like this. The Fooocus Sharp and Fooocus V2 styles are also still selected, so we mix in a little photorealism; if we deselect those and only use the 3D model style, the look leans more in that direction. So I stop this, generate again with only the 3D model style, and it should work better; for the next pictures I can switch the other styles back on. Then there are Models: you can use different models and different LoRAs, but if you just want to work fast, you don't need a deep dive into models and LoRAs, and you most likely don't need the advanced settings either. What you might need is Enhance: with it you can make small variations and also do upscales, exactly like in Midjourney. And you can press on Input Image to upload images, and do upscales there too. Let's make a realistic cat once again — I'll just type in "cat"; yes, I'm setting a bad example with the prompt engineering here, I just want a cat so I can show you what we can do down below. And there we have it: our cat.
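By the way, everything Fooocus does here is Stable Diffusion under the hood. If you ever want the same positive/negative prompt behaviour in your own script, the Hugging Face diffusers library gives it to you in a few lines — a rough sketch; it assumes a CUDA GPU and the diffusers, transformers and torch packages, and the SDXL model ID is just one common example, not necessarily what Fooocus uses.

# What Fooocus does for us, reduced to a minimal diffusers sketch:
# a positive prompt plus a negative prompt.
# Assumes pip install diffusers transformers torch and a CUDA GPU.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="photo of a woman, instagram model, natural light, sharp focus",
    negative_prompt="ugly, blurry, red",   # what we do NOT want to see
    num_inference_steps=30,
).images[0]
image.save("model.png")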
Back in Fooocus: if we drag this cat down into the input image, we can make variations, either subtle or strong. If you press Vary (Subtle), you can also type in, for example, "happy", and you get a happy cat: press Generate, everything changes just a tiny bit, and maybe the cat tries to smile. Let's see how it works out — yeah, maybe it looks a little happier. This works better with humans, if you type in "smile" for example, or with colors; with this cat you could shift the colors a tiny bit. So you can play with these variations. You can also upscale to increase the resolution: press on it and then press Generate — yeah, come on, it does look a little happier, at least to me. Then you have Image Prompt, which is especially cool: press Advanced, and you can upload images and use Image Prompt, PyraCanny, CPDS and FaceSwap. Let me explain how this works. If you drop the image in here and use Image Prompt, you can type in, for example, "dog", press Generate, and the first steps of the generation will follow this image, so we reuse the style of this picture. See for yourself: the style is really similar to the previous generation, because we used the input image as an image prompt — the green background, similar lighting, similar colors, and so on. The next options are PyraCanny and CPDS; these two are ControlNets, similar to what we saw in the Midjourney video. If we type in "tiger" now and use PyraCanny, we use a control that preserves the structure and pose of the image: basically, we will create a tiger in a similar pose to this kitten. It will most likely be sitting, in a really similar pose — even the tail and the ears will be similar — but we should get a tiger. See for yourself: we have the same composition, but now it's a tiger. Yeah, this will be cute, I think — a little tiger that sits just like our kitten, but rendered as a tiger. After 50% of the steps the model can take over a bit more, and the image changes a little, so it becomes more and more tiger and less kitten. If you want even more of the kitten, or an even closer pose, you need to play with these controls. You can see the pose was not perfect — similar, but not perfect. What you can do is increase the Weight a little and the Stop At value: if we set Stop At to 0.8, for example, the control is applied for 80% of the generation steps, so the result should be a lot more similar. You can see it now — it's really close to the kitten, just with different colors for the tiger — and the control runs for 80% of the steps, with only the last steps taking over a bit more. Let's see if it works or not; like I said, you need to play with these. I think this picture got messed up because we also added this other thing here.
Yeah, this is not perfect — we need to play with these things. I tried it once again, and I think this one is a little better: we have a really similar pose now. So these ControlNets let you reuse a pose. This is especially powerful with humans in a specific pose: if you have a ballerina doing something fancy, you can recreate something that looks really similar with PyraCanny. The next thing is FaceSwap: you can upload a picture of your face, for example, and simply swap it in. You can also combine these things — PyraCanny from a ballerina, the FaceSwap from another person, and maybe something else as the style reference. So play around with it. The next thing is inpainting; you already know how this works. Drag the image down, and let's say we don't want this tail here — we simply inpaint it away. Inpainting in Fooocus with Stable Diffusion is really deep; you can do a lot here, but generally speaking, if you just want to work fast, use it like in Midjourney. This is a gigantic tool and we cannot go over every single detail. The next thing is Describe: if you use it on this picture and press "Describe this image into prompt", you get the prompt back. You can also upload images from your own computer and see what a prompt for them could look like. This is the prompt as the diffusion model sees it: "An orange tiger stands on some rocks" — so, come on, this works. Then we have Enhance — you already know we can do upscales and so on — and the metadata: if you drop in a picture that has metadata, you can read it out, which is especially powerful because if you or other people include it, you can reuse those exact settings. The last thing I want to show you is the logs: under Settings you can open the history log and see everything you created previously — all your creations, with the resolution, the prompt, and the settings that got you to each result. That is basically the fastest way to explain Fooocus. Fooocus is a gigantic tool, Stable Diffusion works in the background, and you can use it forever for free. If you want a web interface for Stable Diffusion, you can use Leonardo.ai — also one of my favorite tools if you want to work in the browser. It has basically the same things as Fooocus and is a little easier to use, but be aware that in Leonardo you also need to pay relatively quickly. Here you have, for example, Canvas, real-time generation, motion, image creation, upscalers; you can train your own models, and there is 3D texture generation — so a lot of control in Leonardo.ai. They also have small tutorials for all their tools, so take a look at those if you want to dive deeper, and let me know if I should add a separate lecture. But since we want to do this as fast as possible, I think you should work with Fooocus if you want to use Stable Diffusion quickly. Now, if you want to use Flux and the other diffusion models, you should go on Replicate. Replicate is not free; here you need to sign in with GitHub.
So yes, these open-source tools can feel a little overwhelming at first glance, but as soon as you get them, they also work really fast. On Replicate you can use the Flux models, you can use Recraft, you can use every single model under the sun — Stable Diffusion 3.5 Large, a lot of really good models. If you open one of these models, they are really easy to use: you type in what you want to see on the left, and you get your output on the right side. This one looks really realistic. Something that also works really well in Flux is text. Let's say: "A woman holding a sign with the letters: I am not real." Then we press Run — but attention, this costs, I think, $0.06 per generation, and you need to connect your GitHub profile. Here you can see some pictures that were created with this model, so it works really well — and just wait for this output, because the text is rendered stunningly well: "I am not real", and this is a perfect picture.
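If you would rather call these models from code than click around on the website, Replicate also has a small Python client — a rough sketch; it assumes pip install replicate and a REPLICATE_API_TOKEN environment variable, and the model slug and inputs are examples that may change.

# Calling Flux on Replicate from code instead of the website - a rough sketch.
# Assumes pip install replicate and a REPLICATE_API_TOKEN env var.
import replicate

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a woman holding a sign with the letters 'I am not real'"},
)
print(output)   # typically one or more URLs/files for the generated image(s)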
So in this video we took a look at the open-source diffusion models: we have Stable Diffusion, we have Flux, we have Recraft, we have a lot of different things, and we can run them in a lot of different ways. We can download them and run them locally with, for example, ComfyUI or Forge. One of the easiest ways is Fooocus inside Google Colab, because you press Play on one button and can use it for free forever. And if you want to work over an API, use Replicate — there you can use every single open-source diffusion model that has an API, but you need to pay a little. So play around with this for a bit; I would guess you should stick to Fooocus if you want to create fast. See you in the next one. 17. Recap of Picture Generation with Diffusion Models: In this section we learned how to use standard diffusion models to generate pictures. You learned how they work — the computer is trained on text and pictures, and in that process it learns how to generate a picture so it can recreate it — and that you need good prompts for good outputs: you need to be specific. We have a lot of different diffusion models — DALL-E, Midjourney, Ideogram, Adobe Firefly, Stable Diffusion, Flux, Recraft, and many more — but all of them work relatively similarly. You always need good prompts, and you have learned how to write them; you also saw that you can edit your pictures with inpainting and outpainting. Now I want to remind you: learning is same circumstances but new behavior. Until now you maybe didn't know how to use these diffusion models; now you do, so you should totally use them. Make some pictures for your marketing, for YouTube thumbnails, for presentations, for ads, for whatever you want — only then have you really learned it. Or just have some fun creating pictures. I also want to tell you what good learners do: they learn together, because more people always know more than a few people. So if you could share this course, it would really mean the world to me — and maybe to the other person too, and if they get value from it, they will connect that value with you because you told them about it. Thank you for that. See you in the next section, because diffusion models can do a lot more: they can make audio, entire songs, and videos. 18. Ai Videos with Kling AI: Yes, AI can also make videos, and we have a gazillion different tools: we have Pika Labs, we have Runway, we have Hotshot, we have Dream Machine from Luma Labs, we have Sora from OpenAI (yes, Sora is not usable right now), and we have Kling AI. Of course there are many more, and all of these tools work relatively similarly. Pika Labs has something special: you can create those videos you sometimes saw going viral, the ones where stuff is melting — they went viral on social media from time to time, and in Pika you can create them. In Runway you also have a lot of flexibility: simply log in, create all of these videos, and check out their own tutorials. Hotshot is really easy: you type in text and get a video back. Dream Machine from Luma Labs is basically the same, and in most of these tools you also have a start and end frame. I think that right now, at this minute, Kling is one of the best options: you have AI images, AI videos, a video editor, and so on. That's why I just want to show you Kling AI — like I said, right now it gives you really good results, and you can start completely for free, which to me is the coolest part of it all. Most of these AI video generators work very similarly, so I'll show you Kling AI, and if you really want, you can play with the other tools yourself. The first thing you need to do is go to klingai.com. It is a Chinese product, but they also have an English version, and you can do a lot here. On Home you get the overview and can see the best video shots; they also have generations with sound included: "Am I dreaming? I am so tired." If you take your time, you can really make cool generations — these are all short films, just look at them for yourself, they are stunning. Then you see the best creatives; these are just pictures, and you can see they make really nice pictures here too — this one, for example, I like. So you can create videos, make short films if you clip some things together, make AI images and AI videos. If you press on AI Images, you can simply create images — though I have to tell you I don't love this feature inside Kling, because for AI images I think Midjourney, Stable Diffusion and so on are a bit better. So don't waste your time with AI images in Kling. What you should do is press on AI Videos, because there you can do a lot. You type in a prompt; you can increase or decrease the creativity; then you pick the mode — if you use the professional mode, you need to upgrade to the premium plan, and the quality simply gets a little better (I had the premium plan, but right now I don't). Then you can choose 5 or 10 second generations, different aspect ratios, and the number of generations. Lastly, you can also use camera controls and a negative prompt, just like in Stable Diffusion — but the negative prompt is optional. So let's try this out with one prompt. Of course they also have best practices if you want to dive deeper into prompt engineering specifically for Kling, but generally speaking you should just use the same prompting techniques you already know.
So subject with the movements, the scene, the scene description, the camera language, and the lightning atmosphere. And here they give you a detailed description how you can write such a prompt. Here they give you some examples. This is a classic prompt, then this is a prompt that you made a lot better, and here they have a really, really descriptive prompt. And down here, you see what changes in these videos. If you press on these, you see that generally speaking, you got a good video, but of course, the better prompt yielded even better results. Let's just look at these. You see you have a few more effects, and I think the video is generally a little bit better. And if you have a really descriptive prompt, you see that it gets even a little bit more impressive. What you can do is, of course, to simply copy this prompt and then throw it into your application and see for yourself how these things are working. Here they show you a lot of different examples with a lot of different prompts. Like, there is no point that I show you every single prompt here. You can simply look at this for yourself. It's really easy to use. Then if you go back to Kling, you can, of course, use either Kling 1.0 or Kling 1.5. If we go in 1.5, we have, generally speaking, a little bit better quality, but some features are not included, but they will. Let's just work with Kling 1.5. Include a good prompt, the creativity at medium, the standard mode, 5 seconds, 16 by nine, one video. I don't want to include any specific camera controls, but you can do it if you want to have horizontal vertical Zoom or some o, come on, let's just use the Zoom. And I just want to have a small Zoom here. And then a negative prompt, let's just use logo, watermark, blurry, ugly, and then we press generate and we pay ten credits here. All in all, we get, I think, like 100 credits a day, and then you can create this stuff. And while this is creating, you can also leave the page and do similar things in the meantime. So let's just do this. If you go on cling 1.5, you can do basically the same things here. But if you are in 1.5, some features are not there. If you scroll down here, the camera movements, they are disabled in 1.5, but I am sure they will come back. If you go once again vacuum in cling 1.0, they are included once again, of course. Then if you go on image to video, so this is text to video. If you go on image to video, you can throw up your images, and then you can mix them with a prompt. And you can also use this motion brush. I want to show you this motion brush immediately. You have also here creativity, standard mode, length, and so on, and also the camera movements, but they are right now disabled, and you have a negative prompt. So if you use, on the other hand, cling 1.5 right now at this minute, you do not have the camera movements right now included, and you do also not have the motion brush. So let's just use 1.0, and then we upload the picture. It does not matter what picture you use. Let's just use something from my generations. I just want to upload this right here. So we can simply animate this guy, and I want to do it really simple. Come on. A guy, docking. Then, of course, you can use draw motions with the motion brush. If you do not use it, this will be just a random creation. But if you use the draw motion on the other hand, you can simply tell the diffusion model how it should behave, and they also give you some instructions. 
You can use, for example, the area one, use Shrek, and then press some specific things that you want to use. You can either mark this for yourself, with a static area. Or you can also use, for example, the auto segmentation and press on the stuff that you want to animate. If you want to delete something, you can also delete. So you can do this however you want. It's important that you just mark the stuff that you want to utomate, not automate animate, of course. What I want to do right now is, of course, to add movements, and for that, I do not use static, but I use area one, the outdo segmentation, and I simply press on every single thing that should be not still this time. As soon as you have found out what you want to animate, so let's just say I want to animate right now this whole guy, like you can see it. What we can do is to press on track, and here we can now draw what this guy should do. So let's just say this guy should go in this direction and maybe a little bit then in this direction. So we can simply draw here something, and then you see how this is working. If you press confirm, this is okay. If you don't confirm it, just do it once again a little bit different. So let's just say you want to have him in this way. I think right now this is working, so we press confirm right now. And then we will animate this guy and this guy will simply walk into this direction as soon as we press generate, of course. In the meantime, we had our other video with the banda that is drinking coffee, reading a book that has also some glasses so you see you can make cool generations. Then this guy is doing and he is moving after it. Then if you go down, you have your motion path included. You have, of course, also the creativity and so on. Press generate. And then you will see that we can animate this picture with ease. By the way, you have also a motion brush user guide. If you press on it, they show you exactly how you can use this tool, and they give you also a lot of examples that you can take a look at. Here they have animated this ship. Let's just take a closer look. This ship, then it was marked where these things should move. So here, they used the brush tool to move the ship in this direction and the water in that direction. And this was the video. So you see it works really, really great. The animation is awesome because the ship moves in a different direction than the water. Get this cool effect that it would be windy on the water. The water moves in this direction, but still the ship can move into the other direction. The same thing is true here for these dogs. They have simply marked the dogs, and then they have told the dogs in what direction they should look. And if you press here, play, you see that the disc also turned out to be perfect. Let's just make it big. The dogs look exactly into the direction where you brush it. This thing with the apple is also great. They have simply marked the apple, as you can see down here, and they used the brush tool to move the apple downwards. You can see the output here. It worked great. And you see, we also have the water that is splashing. Let's just make this big. If you look closely, it is not 100% accurate, not 100% perfect, but this is a nice video. You can even make commercials with these videos. And here they have the cat and the cat is jumping over this thing here. Let's just take a look. Here you see that the cat is jumping. This also turned out to be really nice. Yeah, the landing was not perfect. 
She's not on point, but this can happen to a cat from time to time. Also, here, you have a lot of examples that you can use. Like you can make really stunning animations. You can brush here however you want. The next thing that I want to show you is, of course, that you can do even more in the meantime. So if you go on image to video, you can, for example, delete this guy here, and then you can also press at end frame at the end. So let's just do something really cool right now. I want to upload this picture. This is a mid journey picture. Then I press at end frame, and then I upload the next picture. So you see these two pictures, let me just open them up. This is here a girl, and I have recreated a girl with the same seat that is a little bit older. You already know the game how this works. So this is her a little bit older, and this is she a little bit younger. And now we want to transform her with a video. These videos they got viral from time to time. And here we can simply type in a woman aging, for example, we have the start frame, we have the end frame. Then we can not use the motion brush right now at this minute. But we have here every single other thing at the default settings, and then we can simply press generate once again and we will recreate something really, really cool. So here you can make a lot of generations one after another. In the meantime, I will show you some generations that I have made previously. So here you see, this was a really simple prompt. I think the prompt was a small dog is lying on a cat. Here you see a beret that dances in the jungle. Here I used, for example, a picture from flux, and I have simply made her dog. You see this works really, really nice. There are a lot of posts on eggs that got viral that did something like this. Here I did the same thing, and the second generation turned out to be even better. This really looks like real generations. The only thing that is messed up here is this hand a little bit. In the first generation, also the hand is messed up a little bit. Here I have made something with, like, a landscape, and then we go into another picture. This is start and end frame. So you see basically we can move here around. Then this is our panda that I have generated. This panda is right now simply reading, and then we get our new generations, and I will show you them as soon as they are done because this is done like in a few seconds right now. One of the generations is done, and surprisingly, it's this one, the thing that we started later. And here you can see how she is getting older. You see that this works really, really nice. She starts out young, and then she transform into this older version. These are these videos that got viral sometimes on Twitter, and you can recreate them right now if you want. Yes, sometimes it does not turn out to be perfect. But if you play a little bit with these, you can totally shoot for these. And that's basically every single thing that you can create. I will blend the next thing in as soon as this is generated. So basically, this is how you can work with Kling AI. You can simply make an account, and then you can start for free, at least right now. You can dipe in text and get video, and you have a lot of control, and they also tell you how you can write your prompts. The next thing is, of course, that you can also images to videos. You can simply upload an image, and you can also transform it with this motion brush. You can mark it and you can simply tell the AI where this thing should go. 
And you can also include a start and an end frame; with start and end frames, things like transformations are really cool. So please give this tool a shot. I'm convinced you'll find it cool.

19. Text to Speech with ElevenLabs & more: Yes, AI can also make voices, and I like that too. This was Alloy, text to speech from the OpenAI Playground, and you already know that one. We have a lot of tools that can turn text into voices, and we can do a lot more. The OpenAI Playground is one of the easiest: you simply type in what you want to hear, and OpenAI creates it. There are also open-source alternatives, for example F5-TTS; you can install it locally, and if you want to test it quickly, you can run it in a Hugging Face Space completely for free: you upload an audio sample, type in the text you want to generate, and it will clone your voice.

But I think one of the most powerful tools is ElevenLabs, because it gives you a lot of flexibility, you can start for free, and you have a lot of languages. Let me show you: "The ElevenLabs voice generator can deliver high-quality, human-like speech in 32 languages. Perfect for audiobooks, video voiceovers, commercials, and more." You hear that the voices are really good, and you can do a lot with them, so I want to show you as quickly as possible what you can do inside ElevenLabs. If you want to start quickly, ElevenLabs is the way to go: you can start for free, and later, if you want to create a lot, you pay. But it's fast.

The first thing you do is go to their web page and press the button to go to the app. Then you need to register; just make an account with Google or whatever you like. The interface is really simple. On the right side you have Simple and Advanced; we start with the simple interface. You type in your text, and then you can use different voices. This is a deep male voice by Arnie, a voice I created myself. If I press Generate Speech: "I think I like this tool." You see we can generate this speech really fast, and if you like the output, you can download it with this button. Under History you see all the generations you have made, and you can download those too. Yes, I have made a lot; there are pages and pages of generations, and you can go back and recreate things really quickly.

If you go back to Generate, you most likely have no voice of your own yet. If you scroll down a little, you see that I have a big voice library: I have cloned voices of Elon Musk, of myself, and of Angela Merkel, plus some generated voices that I made, and then there are the default voices. Right now, at this minute, you most likely just have the default voices, but of course I want to show you how you can clone voices, even your own. This one sounds somewhat like me, so let's generate "I think I like this tool" with my voice: "I think I like this tool." Yes, you see, even the English is better than mine. Maybe I should replace myself with AI; I'm sure we'll get to that point. That's the point of all of this.
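By the way, if you are a developer: the OpenAI text to speech that opened this lesson can also be called over the API. Here is a minimal sketch, assuming the official openai Python client and an OPENAI_API_KEY in your environment; the output file name is just a placeholder.

```python
# pip install openai   (and set OPENAI_API_KEY in your environment)
from openai import OpenAI

client = OpenAI()

# "tts-1" and the "alloy" voice are what the Playground demo uses.
response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="I think I like this tool.",
)

# Save the returned audio to an MP3 file.
response.stream_to_file("alloy.mp3")
```

ElevenLabs has a developer API as well; we will get to where you create an API key for it at the end of this lesson.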
The next thing is that you can press on Advanced. There you can use different models; under Settings you see "Eleven Multilingual v2: our most lifelike, emotionally rich model in 29 languages, best for voiceovers, audiobooks, post-production, or any other content creation needs." We have English, Japanese, Chinese, German is in there too, and a lot of voices, so this works great. Besides that, you can switch to other models if you really want, for example Turbo v2.5 or the older v2 and v1 models; those get a bit worse as you go back. The one that can make sense is Turbo, their high-quality, low-latency model, so it's a little faster, but I just work with the normal one. Then you have stability, similarity, and style exaggeration. You can play with these, but generally the default settings work really well. You can also enable the speaker boost if you want, and if you mess with these too much, you simply press "to default settings" and you get the defaults back. I have to say, I normally don't touch these advanced settings much, because the defaults work great.

Then, on the left side, you see there is more than text to speech. By the way, in the text field you can throw in whatever you want; you can paste nearly entire books and turn them into audiobooks, and this should also work for free. This is really awesome. We'll look at the pricing later, because you can start for free.

The next thing is the voice changer, and the voice changer is really awesome. You upload speech and you get speech back, but in a different voice. Let's pick, say, the deep male voice by Arnie; now I can record myself or upload an audio and it will recreate it in that voice. Let's try it; I'll record this audio: "This will be a test if this tool from ElevenLabs is working in real time or not. I hope you don't let me down." Then we press Generate Speech: "This will be a test if this tool from ElevenLabs is working in real time or not. I hope you don't let me down." You hear that even my silly accent gets duplicated, but in a different voice. I can also use other voices, like Adam, one of the legacy voices that works really well. We could make me talk like a woman, do silly stuff with it, or add other accents.

The next thing is to press on Voices, and here you can do a lot. You can go on All, Personal, Community, and Default. At this point you will most likely just have the default ones, and you can always listen to how a voice sounds by pressing Play: "Trust yourself, then you will know...", "...government of the people, by the people...", "The world is round...", and so on. You hear these are great voices. If you press on Community, you hear the voices the community likes and has created, for example: "We have committed the golden rule to memory; let us now commit it to life." "To exist is to change, to change is to mature, to mature is to go on creating." "You can't blame gravity for falling in love." This is great stuff.

Then you can go on Personal. There you find the voices you have created, if any. If you haven't created any yet, press Add New Voice, and you get Voice Design, Instant Voice Cloning, the Voice Library, or Professional Voice Cloning.

If you press on Voice Design, you simply describe what you want. Let's say female, young, American accent, and an accent strength that is okay, and you get an example of how this would sound. You can press Use Voice, or first Generate to hear her: "First, we thought the PC was a calculator. Then we found out how to turn numbers into letters, and we thought it was a typewriter." It's okay, but let's say you want a different accent, British, and a strong one: "First, we thought the PC was a calculator. Then we found out how to turn numbers into letters, and we thought it was a typewriter." You see, you can shape this however you want; you can also do male, old, Australian, low accent strength, one last time: "First, we thought the PC was a calculator. Then we found out how to turn numbers into letters, and we thought it was a typewriter." If you like it, press Use Voice and it lands in your voice library.

If that's not what you want, press Add New Voice again and choose Instant Voice Cloning. Give it a name, "Me" for example, then upload a few samples; they tell you what to upload: "No items uploaded yet. Upload audio samples of the voice you would like to clone. Sample quality is more important than quantity. Noisy samples may give bad results. Providing more than 5 minutes of audio in total brings little improvement." What I tell most people is to use roughly four to eight minutes of really good, high-quality audio. You can spread it over up to 25 samples; the only restriction is that a sample cannot be bigger than 10 MB. So you could upload, for example, three tracks of two or three minutes each with good audio quality, and then you get your voice. You can add a few labels if you want and a small description, and then you have to confirm that you won't do anything stupid with these voices. Press Add Voice and you are done. I have done this with my own voice, with Elon Musk, and with more.

The next option is the Voice Library, which you already know: here you find voices from other people. And the last option, if you press Add New Voice once more, is Professional Voice Cloning. For that you need to pay a bit more; you talk to ElevenLabs, send them some voice samples, and they create a voice that sounds really crisp. Most people do this when they want to clone their own voice and produce entire audiobooks with it. It works great: a friend of mine did this, and he gets more streams with his cloned voice than with his original voice, so you can do cool stuff with it. And of course, in the library you can also find a lot of ready-made things; let's say you want to create content for social media: there are plenty of voices for AI videos, YouTube Shorts, and so on, in different languages. You can make a lot of cool stuff here.

Besides that, you also have sound effects. You can create sound effects for whatever you want. Let's make a dog barking; you get a few examples, and they sound great. My dog is not here right now, normally he's always around, but this sounds almost like him. So you just type in what you want to create, press it, and yes, you can use this stuff commercially. Under Explore you find effects other people have made, you see the weekly topics (this one is cool, for example), you can search for what you want to hear, and there are categories: press on Animals and you find a cat meowing, birds singing, a frog, and so on. You can always reuse the prompt or just download the effect if you like it. You can also make booms or braams or whatever you want; you can create really good sound effects, and like I said, you can use them commercially.

The next thing I want to show you is Projects, because you can build entire projects. To explain it quickly, I'll show you this video, because this is a feature where you need to pay a bit more; I have the basic plan, but if you want to do a lot here, you need the stronger subscription (I'll show you the subscriptions at the end of the video): "Introducing Projects, your end-to-end workflow for crafting audiobooks in minutes. Whether you're starting from scratch, pulling from a URL, or uploading EPUB, PDF, or TXT files, Projects has you covered. With your text in place, you can convert everything to audio with the click of a button. If you want to mix up voices in your audio, you can now easily assign particular speakers to different text fragments. 'Chapter one, the bus stop. Hey, do you know when the next bus is?' Matteo asked. 'I think it should be here now.' If you need to fix a section, Projects lets you seamlessly regenerate." So basically you can build entire projects with different speakers and do a lot more. If you're interested, watch that video yourself; you just need the better plan for it.

I want to show you the pricing now, because I get questions about it from time to time. There are several plans. I am currently on the Starter plan, which is cheap: I pay, I think, around five bucks a month, but you can get more. On the free plan you can play a little, with the $5-a-month plan a little more, and then there is the Creator plan, the most popular one: you start at 11 bucks a month, and then it goes up, I think, to 22 (I'm sure this will change a bit over time). You can also see what you get: for those 11 bucks a month at the start, you get professional voice cloning, Projects, Audio Native, and higher quality, and with the Pro plan you get even more. Those are basically the plans, and you get two months for free if you choose the annual subscription. You can look at this yourself.

The next thing I want to show you is the Voiceover Studio, which is also really cool. It's in beta right now, and here too you need to upgrade your plan. This guy explains everything the Voiceover Studio can do: basically, you can build whole projects, upload videos, and create voiceovers natively inside ElevenLabs. It works really well; I have tested it a few times. You can generate speech and sound effects in one editor, import video directly, layer your audio tracks, and edit them precisely. It's basically video editing with audio that comes natively out of ElevenLabs, and it works great.

Then you have the Dubbing Studio. They have some resources for it, so I won't spend a long time here; I have generated a few things myself. If you press Create New Dub, you give your project a name, set the source language and the language you want to translate into, and then you upload your track, either from YouTube, TikTok, or other sources, or manually, and then you create it. This costs 3,000 credits; right now, at this minute, I have 55,000 credits left for this month, so I could do this plenty of times. This is something I really love, because you can translate your videos really fast, and their guides explain it in more detail if you want; there's no point in me repeating every step they already show you. Basically: create a new dub, upload your stuff, and you are ready to rock; you can recreate your content in other languages. And the coolest thing is that this also works on the basic plans, so you can translate videos easily.

Then you have Audio Native, which is also really cool, and which also needs a stronger plan. Basically, you get a code snippet, copy it onto your webpage, and your webpage gets a little player bar that reads the entire page out loud. I don't have a webpage myself, but if I did and published articles regularly, I think I would include this: visitors can simply press the button and ElevenLabs reads the article to them. "Logic will get you from A to B. Imagination will take you..." So basically they have this bar, and the bar reads your entire website for your visitors. Even the New York Times has included this, along with a lot of other sites: if you open a New York Times article, you see "Listen to this article"; you press it and ElevenLabs essentially reads the article out loud. I'm not sure I can play it here, because, well, it's the New York Times.

And the last thing down here is the Voice Isolator. You drag and drop an audio file with poor quality and it makes it a lot better; the demo video shows perfectly how this works: "Need to remove background noise from your video? Use our new voice isolator model for crystal clear audio every time." So if you have noisy voices or a lot of background sound going on, you upload your audio and it comes back much cleaner, with crystal-clear output, even for big files up to 500 MB.

Up here you always see how much you can still create: in total I have 60,000 credits a month, and right now 55,000 left. Then you have notifications, if there is something special going on.
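And while we are talking about credits: if you are a developer, everything we have been clicking through can also be driven over the ElevenLabs HTTP API (we will see where to create an API key in a moment, under your profile menu). The sketch below is based on my own use of their REST endpoint rather than anything shown in this lesson, so treat the endpoint, headers, and field names as assumptions to verify against their documentation; the API key and voice ID are placeholders.

```python
# A rough sketch with plain HTTP requests; check the ElevenLabs API docs before relying on it.
import requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"   # placeholder: created under your profile, as shown shortly
VOICE_ID = "YOUR_VOICE_ID"            # placeholder: every voice in your library has an ID

resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY},
    json={
        "text": "I think I like this tool.",
        "model_id": "eleven_multilingual_v2",  # the same model we used in the web app
    },
    timeout=60,
)
resp.raise_for_status()

# The response body is the generated audio (MP3 by default).
with open("speech.mp3", "wb") as f:
    f.write(resp.content)
```

You could wire this into the same kind of Python projects that already call the OpenAI API.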
The next thing you can do is press on your name, and there you have a lot of other options. You have your profile, where you see some information; then you can press on API Keys: if you are a developer, you can generate API keys and build applications with ElevenLabs. Next is the subscription, where you manage your plan; then payouts, if you are an affiliate, and if you are not, you can press "become an affiliate". You can get up to 22% in commissions, and I have to tell you, yes, I am an affiliate of this program, because I use it myself and love it, and I think I have made roughly 100 bucks with it because I published one or two videos about it. Then there are the usage analytics if you want to dive deeper, a whole documentation (if you are a developer, have a look for yourself), the changelog, the help center, the affiliate program page with a bit more about it, and the AI speech classifier. And lastly the terms and privacy: yes, you are allowed to use this commercially, but you are not allowed to clone voices of other people without their agreement. And of course, you can sign out.

If you want to become an affiliate, because people ask me this all the time: you just contact the affiliate team, press here, type in your information, and you get a link you can promote. You will get a link like this; I think I did this over PartnerStack, so this would be my link, and maybe I'll include it in the last lecture. If you take out a subscription to ElevenLabs, you can use that link and support me, and of course you can do the same thing yourself: make your own affiliate link, place it in videos or on social media, and maybe you earn back what you pay for the tool, so it's basically free.

So in this video you learned how ElevenLabs works. Generally speaking, it is, at least in my mind, one of the best AI tools for generating speech from text, and you should totally try it out.

20. Transcribing with Whisper: Let's talk about Whisper. Whisper is the free, open-source tool from OpenAI, and you can even run it locally. It turns speech into text, so you can make transcripts. If you scroll down, you see how the technology works, and you can dive deeper if you want. Here you also get the whole setup if you want to install it locally: you pip install openai-whisper, then you install the other requirements shown here, the upgrades, and so on, and then you can basically use it. If you don't want to do that, you have a lot of other options; the easiest is probably Pinokio. You download it, unzip it on your PC, and you get an interface that looks something like this. There you can search for "whisper", press on it, and simply download it. Pinokio makes this really easy: if something is missing, you just press Install and everything is handled automatically, so you don't have to worry about anything. And if you go to the OpenAI platform, you can of course also use Whisper in Python and make API calls; it's really easy to use.
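To make that concrete, here is a minimal sketch of both routes: running Whisper locally with the openai-whisper package, and calling the hosted model over the OpenAI API. The file names and the choice of the large-v2 model are just examples.

```python
# Option 1: run Whisper locally (pip install -U openai-whisper; ffmpeg must be installed).
import whisper

model = whisper.load_model("large-v2")    # smaller models like "base" run faster on weak hardware
result = model.transcribe("lecture.mp4")  # ffmpeg lets it read video files directly
print(result["text"])                     # result["segments"] also contains timestamps

# Option 2: call the hosted Whisper model over the OpenAI API (around $0.006 per minute).
from openai import OpenAI

client = OpenAI()                         # reads OPENAI_API_KEY from the environment
with open("lecture.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f)
print(transcript.text)
```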
You can simply use the snippet shown there to make API calls to Whisper, so you can either run it locally for free or integrate it into your own projects with Python. Whisper is also really cheap over the API: if you scroll down in this article, you see that it costs $0.006 per minute. Yes, that is really cheap; a full hour of audio comes out at about $0.36, so a few minutes is nearly free.

In the meantime, Whisper also finished installing locally, and you get a Gradio web interface. In this web UI, Whisper is really easy to use. You can pick whatever model you want; normally the large-v2 version works fine. Then you choose automatic language detection, or set the language yourself, English or whatever it is, and then you simply drag and drop your file. I'll make an example with something from this course: I upload my file and press "generate subtitle file". First it initializes the model, then we get the output. This is actually a video file, an MP4, and that works too; if you use MP3, it just goes faster. And there we have it; it took about three minutes, but this was running locally and the video is relatively long. Now I can press download and get my file. I opened up the text file, and you see I have the transcript plus the timestamps, so I know what I'm saying at which point in time. This is completely awesome, and you can work with it.

So in this video you have seen how to use Whisper. You can transcribe whatever you want in no time, it is really cheap, and if you want to run it locally, completely for free, you can do that too. It's really that easy.

21. Generating AI Music with Udio: The next thing, of course, is that we can even make music. You can make text, you can make sound effects, so you can also make music; I hope you see by now how big these diffusion models are. One of the best tools right now, at this minute, is Udio, and Udio has also introduced version 1.5. If you press on it, you can see how it works, and I can show you one or two generations that I made. If you press Play right here: "...mosquitoes bustle around..." You hear that this thing works. You can also always listen to the staff picks, the music they think is cool; let's play one for a brief moment: "...from the east to the west, from the north to the south..." So you see this sounds really good, at least right now; it works really well.

Of course you can also upgrade your plan if you press on it, but you can start for free; then you are limited, and if you want to use more, you need to pay a little. You can save a bit if you pay annually, the same stuff as always. But you can start out completely for free, and it's really easy: you press Create and you get an interface. This interface changes a tiny bit all the time and you keep getting new options, but basically you type in what you want, you can get suggestions, you can make a single generation up to 130 seconds long, you can add your own lyrics, you can do a lot here. Now I want to show you the easiest way to create a song.

We simply type in what we want, and of course we need to log in; just log in with Google, Discord, or Twitter. I'll go with Google; I have already made some songs in this tool. Now we type in what we want, for example "a song about a rabbit", and then we have a few options. There is the manual mode; if you are starting out, just use the default settings (I'm no music expert either). In manual mode you can set different tags: should it be rock, electronic, pop, jazz, or something else? I think electronic would be cool for our rabbit song. Then the lyrics: do you want custom lyrics? If you press Custom Lyrics you can type them in; otherwise they are generated automatically. Then the instrumental: how should it sound, do you want to include something or not? And then auto-generated, if you want everything done automatically. For now I leave that off, we use "electronic" and "electro" as our tags, and we press Create. Then we wait a minute or two and we get our song; it is one minute long, and after that we can also remix it. Let's wait until it's there.

And there we have it: two songs. It took about seven minutes to create them, so let's hear how they are: "...Moonlight glows... watch the bunny flow, hop, skip, acrobat... with those bunny feet..." This is awesome; you could play all day with this tool. Now we can do three things: remix them, extend them, or publish them. If you press Remix, you can change a lot: the text, the instrumental, the auto-generated parts, and the variance, so you can make it more or less different and remix however you want. If you think it's cool but want it longer, press Extend. If you press Publish, you share it with everybody on the platform. And if you press the three dots, you can remix, extend, view the track, add it to a playlist, share it, download it, delete it, or report the song if something is not okay. I'll press Extend because I really like this one, but you don't have to listen to the whole song.

I think the best thing you can do is play a little with this tool yourself. Udio is, right now, at least in my mind, hands down the best one; it produces music that we can actually listen to, and we can create and hear it within a few minutes. This was never possible before. Just think about what you would need to do to create a song of this quality without AI: learn to play instruments, learn to sing or find the right people, go to a studio, record it, edit it. That's enormous. Now we can make our own music with a few clicks, and the result, at least in my mind, is nearly as good as music from professionals. And remember: this is the worst version you will ever play with. Udio will also get better and better, and maybe a new tool comes around the corner that is as good as the top artists on the planet. AI is just awesome. Just play with this tool and let me know if you love it; I know you will.

22. Recap and THANK YOU!: Congratulations, you did it, and first of all, thank you. You have learned AI as fast as possible. We started with the basics: what is AI, what are LLMs, how are they trained, and how do they work? That was a little bit of theory, but you need it, because to get good outputs you need good inputs, and you need to understand tokens. Then we looked at which LLMs are out there and how we can use them. We have closed-source LLMs like ChatGPT, Claude, Gemini, and more (those are basically the big three), and then we have open-source LLMs, which we can run with Ollama, in LM Studio, or on HuggingChat. Then you learned what these LLMs can do: you can make small text bigger or big text smaller, and with that you can do a lot, because you can also write code, marketing copy, entire books, emails, and much more. Then we talked about prompt engineering: role prompting, few-shot prompting, structured prompts, and tips like "think step by step". The most important thing is semantic association, so you need to give context. You can also customize your LLM, either with a system prompt or with RAG technology. And of course you can use all of these LLMs via an API and integrate them into your own projects if you are a developer. There is a lot more, of course; there are endless AI tools like Perplexity, which works great for some things, and if you want to play around, HuggingChat is cool too.

Then we talked about diffusion models, starting with picture generation. Diffusion models are trained on text and pictures, and they can recreate pictures when you type in text. Here, too, you need to be specific to get specific outputs, so prompt engineering is important, and it works the same in every diffusion model; just think about what matters. You saw the most important things about Midjourney, Ideogram, Adobe Firefly, and the open-source models like Stable Diffusion in Fooocus, or Flux and Recraft on Replicate. Then you learned that diffusion models can do more, because you can also create audio, video, and voices. Some of the most popular tools for video are Kling, Runway, and Pika. If you want to generate voices, ElevenLabs, F5-TTS, and the OpenAI API are great. If you want to create songs, I think Udio is the best tool right now; Suno also works, and eventually maybe ElevenLabs in the future. Besides that, you can use Whisper, open source, for transcriptions: just install Pinokio and you can make transcriptions really easily and for free.

So you have learned a lot, and I want to tell you once again what learning is: learning is the same circumstances but different behavior. Maybe you did not know that AI can do so many things; now you know, so you should totally use it. That is the most important thing: use AI tools, only then have you really learned. And I want to tell you what really good learners do: they learn together, because more people always know more than one person. So if you could share this course, it would really mean the world to me. Maybe it also means a lot to the other person, and if they get value out of this course, they will credit that value to you because you told them about it. Thank you for that, and I'll see you, of course, once again in this course or in another one. And one last time, thank you from the bottom of my heart, because you have given me your most valuable asset: your time. Everybody on this earth has only limited time, and you decided to spend yours with me. So thank you for that, and you have learned AI as fast as possible.