Artificial Intelligence and Generative AI for Absolute Beginners

Idan Gabrieli, Online Teacher | Cloud, Data, AI


Lessons in This Class

  • 1. S01L01 Welcome (2:02)
  • 2. S02L01 Introduction (1:36)
  • 3. S02L02 AI (7:49)
  • 4. S02L03 ML (4:40)
  • 5. S02L04 DL (3:08)
  • 6. S02L05 Gen AI (3:56)
  • 7. S02L06 Summary (2:18)
  • 8. S03L01 Introduction (2:48)
  • 9. S03L02 The ML Box (4:05)
  • 10. S03L03 Typical ML Tasks (5:35)
  • 11. S03L04 Training Phase (4:53)
  • 12. S03L05 Y=F(X) (5:46)
  • 13. S03L06 Data Types (4:50)
  • 14. S03L07 Features (4:41)
  • 15. S03L08 Supervisor (8:48)
  • 16. S03L09 Summary (5:38)
  • 17. S04L01 Introduction (2:26)
  • 18. S04L02 Artificial Neural Networks (2:57)
  • 19. S04L03 Deep Learning Architectures (7:51)
  • 20. S04L04 Foundation Models (4:03)
  • 21. S04L05 Large Language Models (LLMs) (5:23)
  • 22. S04L06 Model Types (4:07)
  • 23. S04L07 Prompt and Tokens (4:43)
  • 24. S04L08 Total Tokens and Context Window (4:29)
  • 25. S04L09 Next Token Please! (4:05)
  • 26. S04L10 Self Supervised Learning (6:54)
  • 27. S04L11 Improving and Adapting LLMs (8:21)
  • 28. S04L12 Summary (6:53)
  • 29. S05L01 Introduction (2:02)
  • 30. S05L02 Prompt Sensitivity (2:35)
  • 31. S05L03 Knowledge Cutoff (3:53)
  • 32. S05L04 It is not Deterministic (3:51)
  • 33. S05L05 Structured Data (3:01)
  • 34. S05L06 Hallucinations (2:13)
  • 35. S05L07 Lack of Common Sense (1:58)
  • 36. S05L08 Bias and Fairness (3:17)
  • 37. S05L09 Data Privacy, Security, and Misuse (2:05)
  • 38. S05L10 Summary (3:52)
  • 39. S06L01 Introduction (1:37)
  • 40. S06L02 Text Image Video Audio Generation (5:53)
  • 41. S06L03 Web Based vs Application Based (3:32)
  • 42. S06L04 Use Case Brainstorm Assistant (3:41)
  • 43. S06L05 Use Case Summarization (2:31)
  • 44. S06L06 Use Case – Text Enhancement (2:20)
  • 45. S06L07 Use Case Code Generation (7:25)
  • 46. S06L08 Use Case – Content as a Framework (2:52)
  • 47. S06L09 Use Case – Images on Demand (2:42)
  • 48. S06L10 Use Case – Boosting AI Based Apps (2:38)
  • 49. S06L11 Best Practices for Prompts (7:19)
  • 50. S06L12 Summary (4:28)
  • 51. S07L01 Let's Recap (8:59)
  • 52. S07L02 Thank You! (1:21)


18 Students · 1 Project

About This Class

Embrace the Future Today

Welcome to the world of Generative AI, where machines push the boundaries of creativity and intelligence like never before. No longer just a buzzword, Generative AI is transforming industries and unlocking endless possibilities. From art and entertainment to healthcare and finance, its influence is shaping the future. This course is your starting point to explore the core concepts driving this groundbreaking technology.

Simplifying the Complex

AI can seem daunting, but it doesn’t have to be. This course simplifies advanced AI topics, breaking them down into digestible, easy-to-follow lessons. Whether you're a student, a tech enthusiast, a business professional, or just curious, you’ll find valuable insights that make AI approachable and exciting.

Lay the Groundwork for Success

Generative AI is evolving rapidly, offering limitless opportunities. This course gives you the essential knowledge to thrive in an AI-powered future. You'll delve into the key concepts behind machine learning, understand how Generative AI works, and explore real-world applications in various industries. We’ll also cover challenges, limitations, and the ethical considerations involved in this cutting-edge field.

Be Part of the Generative AI Movement

Ready to dive in? Join me on this exciting journey as we explore the fascinating and ever-expanding world of Generative AI!

Meet Your Teacher


Idan Gabrieli

Online Teacher | Cloud, Data, AI

Level: Beginner



Transcripts

1. S01L01 Welcome: Hi, and welcome to this training about generative AI. My name is Idan, and I will be your teacher. I'm super excited that you are willing to explore with me the evolving topic of artificial intelligence, and specifically the exploding subtopic of generative AI. Generative AI is a disruptive technology that is reshaping the landscape of many industries while changing how individuals, teams, and companies perform a variety of tasks. It is being rapidly adopted as a set of tools, services, and components that can help to enhance efficiency, creativity, and innovation. The impact is huge, and we are just at the early stage. This training is designed for anyone who would like to better understand the key principles of this technology from a theoretical perspective. No previous knowledge is needed. We are planning to explore the main pieces of the AI puzzle, the key terms of machine learning and deep learning, and then zoom in on generative AI, revealing the secrets of this technology step by step. My main goal is to spark your imagination and willingness to explore those interesting topics. I will do my best to keep it simple, fun, and interesting. Thanks for watching, and I hope to see you inside.

2. S02L01 Introduction: Hi and welcome. Thanks for joining this training about generative AI. I'm excited to start this journey with you. Generative AI is a super interesting topic, gaining momentum everywhere. As the name suggests, it's part of a larger topic called AI, artificial intelligence. In this introduction section, I plan to identify and map the main pieces of the AI puzzle so we can see the big picture before drilling down to other topics. We'll start by better defining the concept of AI. What is AI? How did we get here, and where are we going moving forward? Those are interesting questions. Then we will add the flavor of machine learning and deep learning to AI and discuss what the big deal is about those technologies. The last step will be to define the next AI evolution wave with the introduction of generative AI. It's going to be an interesting and fun story. I will keep it at a high level so we can better understand the big pieces of the AI puzzle. That's our starting point. See you in the next lecture.

3. S02L02 AI: Hi and welcome. Let's start with the most basic question: what is AI, artificial intelligence? Think about it for a second or two. The answer is not straightforward, and if we ask five different people, we may get five different answers. The AI of five years ago is not the AI of today, and it's not going to be the same five years from now. Things are constantly changing in the AI landscape. In addition, AI is a general purpose technology, which means that AI is useful for many things. It can be used to optimize the process of discovering a new medicine, teach a robot to play a game, make virtual conversation, brainstorm ideas, write content like a blog, generate new pictures, identify objects in pictures, predict a stock price, enhance military equipment, and the list is long. It's a general purpose technology, like electricity that powers endless types of machines. In that case, we need a more high-level definition that can survive a longer period. Let's try to define AI. Do you know what the most sophisticated machine on the planet is? I guess you know. It's you. I'm talking about your brain. The human brain is a complex machine that can digest data from multiple data sources, store and then retrieve it, all while making fast decisions.
Your brain can learn, adapt, and create new things. It is an amazingly complex organic machine, highly efficient and very fast. And to be completely humble before Mother Nature, humans are still trying to figure out how the brain works. It's still a mystery. AI has always been compared to the human brain. The human brain is still considered to be a good benchmark for intelligence, until some smarter alien race takes over, hopefully not very soon. In that case, it makes sense to try building machines that can somehow mimic human intelligence. Your mind can scan a picture in a few seconds and quickly identify the objects in that picture. It's a complex cognitive task. Trying to mimic such complex cognitive functions, like identifying an object in a picture, recognizing a human voice, or understanding the meaning of text, is what is commonly described as artificial intelligence. Those tasks are complex. That's the definition of artificial intelligence: the practice of getting machines to mimic human intelligence to perform different tasks. And if I summarize that in a simple sentence, AI is the human desire to create a digital brain, the simulation of human intelligence in machines that can think, learn, and perform tasks almost like humans, and in some cases maybe better than humans. Are we there yet? Meaning, do we have machines that can think and learn like humans? No. Are we making progress in that direction? Yes, we are. AI is already embedded in our daily lives, as companies are using AI for many products and services. AI is the technology layer for automation, handling a growing number of tasks that were previously performed only by humans. Are we going to reach the scary breaking point where AI will be better than a human brain? Maybe; I don't know. There are many tasks at which AI is already better than the average person. For example, it would be impossible for me to learn 20 different languages in a short period of time, or to summarize a complete book in a few seconds, something that AI can do. However, my set of generic capabilities to handle a variety of very different tasks is still very hard to achieve using a more generic AI. I can drive my car in a variety of road conditions, cook different types of meals, clean my house, play with my kids (and I have three kids), scuba dive with friends, and many more things. I can easily learn new tasks and adapt as needed. Can we reach that point, meaning, develop a generic AI machine that can do multiple tasks like a human being? Maybe; it's hard to predict. I assume that as part of the ongoing progress, we will be able to see AI solutions that can handle a group of tasks that are part of the same domain or expertise. Think about the analogy of you standing 100 meters from your friend, where you are allowed to walk half the remaining distance on every step. Okay? So on step number one, you will go 50 meters; step number two, 25 meters; step number three, 12.5 meters; and so on. Are you going to reach your friend? Well, no. You will get very close, but never actually reach your friend. Maybe AI is like that: a frontier we keep approaching but never quite reach. At this point, no one knows, but there is constant progress in the industry. The race is on. Until a few years ago, AI was not able to digest complex language and generate new content. It was the stuff of a good science fiction movie. All that changed in 2022, when ChatGPT was introduced by OpenAI and changed everything. We'll talk about it in the next sections.
Another important question: is that good or bad? What do you think? Feel free to share your thoughts in the course comments. Some people will say that AI is a very dangerous technology, and they are right. Like any technology, it will be used for bad things, like cracking into our bank accounts while faking our identity. On the other hand, it can be used for great goals, like speeding up the development of a new medicine and saving lives. It will be utilized in different ways. One thing to consider is that automation is a core use case of AI. It's a general purpose technology that can power many different use cases, and therefore it will have a dramatic impact on many industries. A growing number of tasks and processes that were performed by humans are going to be automated. We're in a major evolution period, and it's hard to predict what will happen five or ten years from now. I can assume that many jobs will disappear, or the demand for them will be reduced, and many new jobs will be created. That's part of the game, and we should be able to adapt. All right, I think we have a more solid, high-level definition of AI. In our next lecture, let's zoom in and talk about the next piece of the AI puzzle, which is machine learning. Thanks for watching, and see you next.

4. S02L03 ML: Hi and welcome back. In the previous lecture, we talked about AI and the desire of humans to create machines that are smarter, better, and more powerful, using the benchmark of the human brain: creating artificial intelligence. That's a perfect scary story for a good science fiction movie, right? And you know what, humans have been doing that from day one. Every breakthrough over the years in computer hardware, making computers smaller, faster, and cheaper, is another step in that direction. Computers can run more efficiently, digest more data, store more data, and process data faster. But it is not just hardware. It's also the evolving progress in software engineering. Think about a group of NASA developers that created a complex software program that can run a spaceship: calculate its precise location in space while traveling at an amazing speed, automatically perform the required adjustments to the engine and other systems, and then land it on another planet. I'm still amazed by those space projects. The level of automation is amazing. As you can imagine, it is a highly complex set of tasks that can be automated by software. Those developers need to put all the knowledge and all the rules inside the spaceship program to handle different situations. So if a software program is highly sophisticated, can do many things, can manage a spaceship, can monitor millions of sensors, is that good enough to call it AI, artificial intelligence? Maybe; it's a subjective, very broad definition. However, any sophisticated software created before the AI wave was missing something. It was missing the basic capability to learn and improve, which is a foundation of human intelligence. We can try things and learn from our experience. Our brain is constantly changing. Our knowledge is not fixed; it is evolving. That brings us to machine learning. Machine learning is a subfield of AI. It is the missing component that helped AI to improve almost exponentially, making significant progress while trying to match human intelligence. The main concept of machine learning is to provide machines with the ability to learn things without specifically being programmed about those things. As you can imagine, it was a major mind shift in software programming. Instead of building software that is preprogrammed with a huge number of rules and knowledge, let's create a system that can digest and learn patterns from the data, and make decisions based on those patterns. The small sketch below illustrates this shift.
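Here is a toy Python sketch of the two approaches; the numbers and the tiny fitting routine are invented purely for demonstration, not taken from the course.

```python
# Toy illustration (invented for this page): hand-written rules vs. a rule
# learned from example data.

# Classic programming: a human encodes the rule directly.
def price_by_rule(size_m2: float) -> float:
    return 5000.0 * size_m2  # a developer decided this number

# Machine learning: estimate the rule from examples (a one-feature linear fit).
sizes = [50.0, 80.0, 100.0, 120.0]                 # inputs
prices = [260000.0, 405000.0, 510000.0, 600000.0]  # expected outputs

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n
w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
     / sum((x - mean_x) ** 2 for x in sizes))
b = mean_y - w * mean_x  # learned parameters, not hand-picked

print(price_by_rule(90))   # rule written by a human
print(w * 90 + b)          # rule learned from the data
```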
Machine learning is based on many scary things like algorithms, statistical analysis, training models, and consuming huge amounts of data using complex computing infrastructure. It is a combination of multiple components working together to digest data, learn patterns, get better, and eventually do something useful. Later, as part of the training, we will open the machine learning box and explore the main components running inside. As a quick summary, AI is the umbrella term for trying to create or mimic human intelligence for performing a variety of tasks. Machine learning is a subfield of AI that brings the ability to learn patterns from data. Machine learning is a very broad field with many different methods that are used to handle different scenarios. Some of those methods are highly focused on solving specific tasks, like predicting the future price of a real estate property or classifying an object in a picture. As part of the evolution of machine learning, some of those methods use highly sophisticated algorithms, which leads us to deep learning. That's the next piece of the AI puzzle. See you in the next lecture.

5. S02L04 DL: Welcome back. We just talked about machine learning as a subfield of AI. Machine learning is itself a broad branch of different methods. Those methods were developed over the years by a variety of scientists and engineers, helping to push the boundaries of machine learning. A key dimension of machine learning is related to the complexity of patterns inside the data. The complexity of patterns related to different tasks sits on a scale: some tasks are simple, and some of them are very complex. For example, predicting the price of a real estate apartment based on historical data, using around 100 different parameters like the apartment size, number of rooms, location, and so on, does not require learning complex patterns using machine learning. We can use simple algorithms to learn those patterns. On the other end, teaching a machine learning solution the human language will require more complex algorithms, more time to train a system, and the ability to process and store very complex patterns. That's the job of deep learning. Deep learning is a subset of machine learning. It is called deep learning because it can be used to learn very complex, deep patterns from a large amount of data. It uses something called artificial neural networks, which are inspired by the human brain. Information in our brain is processed by a complex network of interconnected nodes that are placed in different layers. The same concept is used in deep learning, meaning the ability to create many layers in a complex network. The depth is directly related to the number of layers. More layers mean that it can digest more complex data and identify more complex patterns. As part of the training, we'll talk about the structure of artificial neural networks at a high level to better understand this approach. Let's review the AI puzzle so far. AI is the umbrella term for all methods and technologies that enable machines to mimic, simulate, or replicate human intelligence. Machine learning is a subfield of AI, focusing on different algorithms that enable machines to learn patterns from data.
Deep learning is a subfield of machine learning, adding the ability to handle large amounts of data and learn more complex, deep patterns. Now we are ready to talk about another piece of the AI puzzle, meaning generative AI. See you next.

6. S02L05 Gen AI: Hi, and welcome. What is generative AI? Finally, we reached the main topic of this training. I will use some perspective based on my experience. Between 2020 and 2021, I created a training course about machine learning, and it was a great opportunity to explore the main concepts of AI. Machine learning was evolving in multiple directions, with many interesting market use cases. During that time, generative AI was not a major topic. The famous ChatGPT tool that kicked off this new branch of AI was only introduced in November 2022. Now, why am I sharing that with you? Well, it was an amazing major pivot point in AI, taking us in a new direction. Before that time, most of the market solutions and use cases of AI were highly focused on specific tasks: one task per one AI solution. On the other hand, think about the human brain. AI is constantly compared to the human brain. We can digest information using our eyes while scanning light. We can hear things by analyzing sound waves, smell things, or taste food. The input is highly complex, and our brain can handle many different tasks. Another dimension that humans developed over the years is the ability to communicate. The text we read or hear can describe many things, from simple questions to highly complex descriptions of processes, solutions, knowledge, insights, and more. We constantly exchange information using our language. Generative AI added the important capability to analyze text as a language, and that's a major shift in the AI industry, breaking the communication barrier between humans and machines. Now machines can digest, analyze, and understand complex text as input, like direct questions, requirements, instructions, and more. Another super interesting thing that generative AI added is the amazing capability to create new content. It's called generative because it can be used to generate new content: text, an image, a video, a sound wave, and more. All of that is based on patterns learned from data. It's taking us a step further in trying to mimic human creativity and intelligence. As you can imagine, it opens a new frontier of a growing number of business use cases, and a huge opportunity for companies, and also for individuals, to harness this evolving technology. One of the highest potentials and low-hanging fruits of generative AI is the ability to uplift and boost productivity. Generative AI can be used to automate, augment, and accelerate work in many directions. We'll talk about that during the training. Suddenly, creativity, which is fundamental to human intelligence, is also possible with machine learning using generative AI. The sky is the limit. Like AI, generative AI is also a general purpose technology that can be used across many different domains. We are still at the early stage of the evolution of this technology. It has the potential to impact a wide range of industries and applications. All right, that's a quick overview of generative AI. Let's summarize everything in the next lecture.

7. S02L06 Summary: Welcome back. This section was a high-level introduction to the AI landscape.
We talked about the definition of each term and how those pieces are aligned with each other, starting with the umbrella term AI, moving to machine learning, deep learning, and finally generative AI. We started by defining AI, artificial intelligence, as the human desire to create a digital brain, a brain that can mimic human intelligence, so machines can perform more and more complex tasks. AI is a general purpose technology that can be used almost anywhere. Machine learning algorithms were able to boost AI into new frontiers by adding the capability to learn patterns from data. Machine learning is a collection of different algorithms and methods. One of them is deep learning. Deep learning managed to take us much deeper into the ocean to explore new things, handle more complex patterns, and, as a result, improve the ability of machine learning solutions to handle more complex tasks. Then, in 2022, generative AI added the important capability to analyze text as a language, breaking the communication barrier between humans and machines and providing the capability to generate new content. That's our story so far, and we just started. In the next section, we will start to overview the main building blocks and market terms that are related to machine learning. Machine learning is the foundation of generative AI, and we must create some basic understanding around a couple of key terms. You are welcome to test your knowledge and understanding by answering the quiz at the end of this section, just after this lecture. Thanks again for watching, and I hope to see you in the next section.

8. S03L01 Introduction: Hi, and welcome back. I would like to share something with you. I like to watch science fiction movies. I mean high quality science fiction movies that explore new, interesting ideas and topics that are beyond the reach of human knowledge and capabilities. One of them is related to the interesting balance point between humans and machines. Humans have been using machines for thousands of years, making them better, faster, and smarter. We can't even imagine our lives without using machines, as they are embedded everywhere. Assuming this constant improvement will continue in the future, the balance point between humans and machines will eventually start to favor machines. Machines will be more powerful to perform tasks that require human intelligence. That's an interesting idea explored by many movies. Now, which technology is the foundation to make those machines mimic human intelligence? Machine learning. That's the secret engine inside any AI-based solution. For people who have just started to explore the concept of machine learning, it can feel like climbing a very high mountain without a map and without equipment. It's hard to figure out where and how to start. It is an intimidating topic. And we should do something about it, right? That's the objective of this section. It is called a soft introduction to machine learning. We are going to talk about the main building blocks, technologies, and market terms related to machine learning. It is an important step before moving to the main topic, meaning generative AI. Overall, just to set the expectation, machine learning is a very complex topic, and I'm putting a lot of emphasis on simplifying some of the terms. Maybe you already have some background and knowledge about some of those topics, which is great. I still suggest reviewing the complete section. It will help to establish a unified background around machine learning terminology.
All right, let's slowly open the magical machine learning box, step by step. See you next.

9. S03L02 The ML Box: At the most basic level, we can take any machine learning solution out there and simplify its main function using the analogy of a box. This simplified illustration enables us to slowly dive into this topic. This machine learning box will have two sides: input and output. Input is a collection of data to be analyzed by the machine learning box to create the output on the other side. The box is closed because, at this point, we don't care what's going on inside. It is doing something, hopefully something useful. Now, what kind of data should the ML box get, and what will be the output? Well, it completely depends on the task or use case. Let's take a few examples. Our first ML box, number one, is used to classify whether a product is defective or not during the production process. At some point, as part of the production line, a camera takes a picture of the product and feeds that picture into our machine learning box. The box will process that picture, identify all kinds of patterns related to the product (product size, shape, all kinds of indicators), and the output will be: a perfect product or a defective product. It is a typical classification job of a machine learning solution: classify whether the input data is X or Y. Next, machine learning box number two is used to classify whether a product review provided by a customer using a website is positive or negative. This time, the input for our machine learning box is a text with sentences written by the customer while he or she is writing the review, and the output of the ML box will be positive or negative. Again, this is a classification exercise. Another machine learning box, number three, is used to generate an animated video based on a story. The input this time is a text that describes a scenario or a story line, and the output is a video, like an MP4 video file. As you can imagine, there are many different machine learning boxes that are used to handle a variety of tasks. Those are just a couple of examples. Now, this illustration using the ML box is not just useful for this training. In many cases, an ML solution is embedded as a small component in a much larger software application. It is part of a process. Data is flowing in from another component, and the output of the machine learning component is going to another component as input, like a chain. Another very popular example is related to consuming machine learning as a service from another company. Instead of building and maintaining the machine learning box, I'm paying another company to provide me with the option to feed an input and get an output. In software terminology, these are called APIs. There is a growing number of companies that offer a variety of APIs to consume different machine learning services. Someone else builds the machine learning box and then provides pipes to interact with that box. What kinds of tasks can machine learning boxes perform? That's the topic of the next lecture. See you again.
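Before moving on, here is a tiny, hedged sketch of the box idea in Python; the function and its fake decision logic are invented for illustration, and a real box would run a trained model instead.

```python
# Toy illustration (invented for this page): an ML box is something that
# takes input on one side and produces output on the other.

def ml_box(picture: bytes) -> str:
    """Box number one from the lecture: classify a product photo."""
    # A real box would run a trained model here; this fake rule just
    # stands in so the input/output "pipes" are visible.
    return "perfect" if len(picture) % 2 == 0 else "defective"

photo = b"\x10\x20\x30\x40"   # input pipe: data flows in
print(ml_box(photo))          # output pipe: "perfect" / "defective"
# Consuming an ML service over an API looks the same from the outside:
# send the input to someone else's box, receive the output.
```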
10. S03L03 Typical ML Tasks: Hi, and welcome. We have a simple illustration of a machine learning solution using a box. The box has input and output, nothing scary, right? I also provided a couple of examples using those machine learning boxes. Now, I would like to make it a little bit more generic and talk about the typical tasks that machine learning solutions are used for. We can divide those typical tasks into the following four categories: prediction, classification, clustering, and content generation. Let's review them one by one, starting with prediction. Predicting something is a key and practical use case of machine learning. It refers to the process where a machine learning solution is trained to predict the value of a target variable based on input data. Let's mention a couple of examples: predicting the future prices of stocks based on historical prices, economic indicators, or recent news; estimating future sales of a product by analyzing previous sales data, marketing efforts, and seasonal trends; using historical weather data to predict future conditions like temperature or wind speed; or predicting which customers are likely to leave a service based on their past behavior and interactions. Those are examples related to prediction. The next category of typical tasks while using machine learning is classification. Classification is a fundamental use case of machine learning where the goal is to assign input data into predefined categories or classes. A simple example is spam detection: classifying emails as either spam or not spam, based on features like the content of the email, sender information, and subject line, all kinds of pieces related to the email. This type is called binary classification, as there are only two classes: spam, not spam. Classification can also be used for sentiment analysis: analyzing texts like customer reviews or social media posts to classify whether the sentiment is positive or negative. Another popular example is image classification, categorizing images based on the objects inside the picture, such as: is that a dog, a cat, a house, a car? As we have more than two classes, this is called multiclass classification. Moving next to clustering. Clustering is a somewhat different approach compared to prediction and classification. The task is to discover hidden structures or patterns within the data. The output is not a predefined number like in prediction, or some category label like in classification. We don't know what the expected output is. The goal is to group data points together into clusters based on internal characteristics or patterns. A cluster is a group of similar data points that are closer to each other compared to points in other clusters. Let's mention a couple of examples of how clustering is being used. In e-commerce, clustering can be used to group similar products based on user preferences, allowing for personalized product recommendations, or to group customers based on their behavior, preferences, or demographics to target marketing efforts more effectively. In social networks, clustering can reveal groups of users who interact frequently or share common interests. Recommendation systems can use clustering for knowledge grouping, like grouping documents, articles, or stories that cover similar topics. And the final, most recent category of machine learning tasks is content generation. Generation in machine learning refers to the creation of new content that did not exist before, based on patterns learned from data, producing outputs such as text, images, music, code, and more. We'll see many examples during the training. Those are the four categories of typical tasks performed by machine learning boxes: prediction, classification, clustering, and content generation. I would like to emphasize that those categories are not competing with each other. Each category has tremendous, unique value in a variety of use cases.
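To make the classification task more tangible, here is a small, hedged sketch using scikit-learn; the course itself does not use code, and the toy emails and labels are invented.

```python
# Toy illustration: binary classification (spam / not spam) on invented data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

emails = [
    "win a free prize now", "limited offer, click here",
    "meeting moved to 3pm", "lunch tomorrow?",
]
labels = ["spam", "spam", "not spam", "not spam"]  # the two classes

vectorizer = CountVectorizer()           # turn text into numeric features
X = vectorizer.fit_transform(emails)
model = LogisticRegression().fit(X, labels)

test = vectorizer.transform(["free prize, click now"])
print(model.predict(test))               # expected: ['spam']
```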
All right, we have an ML box with input and output, and we now understand what kinds of tasks such a box can perform. It is time to talk about how those boxes are trained to do their job. That's the topic of the next lecture.

11. S03L04 Training Phase: Hi and welcome back. Any successful Olympic athlete has thousands of hard, sweaty training hours behind them before reaching the level of expertise needed to compete with the top talent worldwide. We can summarize that with a sentence: no pain, no gain. Without a big surprise, an ML box is like an Olympic athlete. It must be trained. Before using an ML box to perform something useful, we need to make sure that it has the required knowledge to digest the data and identify the relevant patterns for our specific use case. This phase is called the training phase, and it is a fundamental concept in machine learning. The training phase is used to train the machine learning box as much as possible with a large enough amount of data, so it will be able to accurately predict, classify, cluster, or generate content based on the input. And as with a good Olympic athlete, this training process can take substantial time and resources. The knowledge inside the machine learning box created during the training phase is called a trained model. A model is also a fundamental term that we will use a lot when talking about machine learning. That's the final output of the training phase. Then the machine learning box can use this trained model to do something useful. This training phase is probably the most challenging step when building a machine learning box. We need to collect a large enough amount of data. In some cases, we need to prepare the data, like cleaning errors, before feeding it into the system. In other cases, we need to manually decide which data elements are more relevant. Then we need to select the most suitable algorithms that will be used to analyze the data. There is a long list of very scary math algorithms that can be used. The selection of the most relevant algorithm will be based on the characteristics of the data to process and the required job to handle. For example, to perform the job of classifying an object in a picture, the best matching approach is deep learning with a neural network. It is also important to somehow measure the level of accuracy of the trained model, using performance metrics, to make sure we're not getting stupid results. It is usually an iterative process, meaning we adjust something, retrain a new model, measure performance, and do it all over again until reaching the required performance benchmark. As you may guess, it is a complex, sensitive process performed by a skilled team of AI engineers. Those AI teams are using different tools, frameworks, and computing resources to build and fine-tune those trained models. It can take hours, days, weeks, and even months. It's all about the complexity of the model. The final model will be copied as a snapshot into the machine learning box running in production, where it is used to answer real requests. This step is called inference. It is like taking our Olympic athletes and letting them compete. That's the knowledge used by the machine learning box to digest the input, apply some magic, and generate the output. Now, let me ask you something. Do you think that after the Olympic competition, our Olympic athletes will stop their training? Of course not. That's also true for a trained model running in production. It must be retrained at repeated intervals to keep it optimized for recent data and recent events. It is like a continuous cycle.
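Here is a toy, hedged sketch of the iterative adjust-retrain-measure loop described above: a one-weight "model" trained until it reaches a benchmark. The task (learning y = 3x) and all numbers are invented for illustration.

```python
# Toy illustration: the "train, measure, adjust, repeat" loop on a tiny task.

examples = [(1, 3), (2, 6), (3, 9), (4, 12)]   # (input, expected output)
w = 0.0                                        # the model: a single weight

for step in range(200):                        # repeated training iterations
    error = sum((w * x - y) ** 2 for x, y in examples) / len(examples)
    if error < 1e-6:                           # performance benchmark reached
        break
    # adjust the model a little in the direction that reduces the error
    gradient = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
    w -= 0.01 * gradient

print(f"trained model: y = {w:.3f} * x after {step} steps")
```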
All right, we covered a couple of terms: the training phase, which generates a trained model using algorithms; the model, which must be validated using performance metrics to make sure it is operating as expected; and finally, the trained model being used by the machine learning box in production to do something useful. But you may ask yourself, what is this trained model created during the training phase? How can the trained model be illustrated? Let's talk about it in the next lecture. See you next.

12. S03L05 Y=F(X): I will start this lecture by saying the obvious: oh my God, a formula. Most of us run away from any formula like birds flying away from a forest fire. That's fine and understandable. Math is something we learned a long time ago and have been trying to forget ever since. However, a little bit of very simple math can sometimes be useful to organize and simplify complex things. I know it may sound strange, but stay with me for a couple of minutes. You remember that we managed to squeeze any machine learning solution into a box, which is great, because at this point in time, I just want to know how to interact with that box. I have two pipes: input to insert something, and output to get something. An ML box with an input and output is basically some kind of data transformation that can be presented as a generic, simple math formula: Y = F(X). X is the input data. It can be text, an image, audio, numbers, et cetera. Y is the output generated by the machine learning model. Like the input X, the output can also be different types of content. F is the machine learning trained model. It is the formula, or function, used to take the input data X and map it into the output of the model, which is Y. It is a data transformation process. This simple formula shows that any machine learning model is basically a mathematical transformation function between input and output, discovered by the algorithm during the training process: taking input X and applying some function F to produce output Y. If I haven't scared you yet, that's great. Let's review an example. If our ML box is about predicting the price of a real estate apartment, most probably the formula will look similar to a simple linear structure, something like Y = 3·X1 + 0.4·X2 + 0.1·X3 + ... This type of function is generated by an algorithm called linear regression in that specific use case, and the function is a linear function. By the way, I invented this formula and those specific numbers; it is just for demonstration. But what is the meaning of those X1, X2, et cetera? X1 can be the number of rooms in the apartment; X2, the square size; X3, the distance from a nearby school; X4, the distance from a nearby hospital; et cetera. Those are the features that are provided as input X. The coefficients, meaning 3, 0.4, 0.1, are the model weights, also called parameters, that were estimated by the algorithm during the training phase. During training, the model was exposed to many examples of apartments and learned the parameters and their impact on the price. For example, we can see that the number of rooms, meaning X1, has a dramatic impact on the price of the real estate apartment compared to the distance from school; it has a bigger weight. So, using this formula, which represents the model, I can take any combination of input parameters and predict the output Y, which is the price of a specific apartment. That's a simple example of a machine learning model, even if it's not a complex use case.
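As a small, hedged companion to this invented formula, here is how evaluating F(X) looks in Python; the feature names and values are made up, matching the demo weights above.

```python
# Toy illustration: evaluating the invented linear model Y = F(X).
# Weights and features are demo values only.

weights = {"rooms": 3.0, "size_m2": 0.4, "dist_school_km": 0.1}

def f(features: dict) -> float:
    """The trained model F: map input features X to output Y."""
    return sum(weights[name] * value for name, value in features.items())

apartment = {"rooms": 4, "size_m2": 95, "dist_school_km": 2}   # input X
print("predicted price Y (demo units):", f(apartment))         # output Y
```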
Let's take a look at a more complex example, like an ML box that can be used to identify an object in a picture. It's common to use deep learning to train such a model, which will be based on a few layers of neural networks. How can I use simple math to present a deep learning model? Well, very simply. The following math formula represents a four-layer neural network: Y = F4(F3(F2(F1(X)))). F1 is the first layer. It gets X as input, like the raw image file to analyze. The output of F1 is propagated as input to F2, which is the next layer, and so on, until reaching the last layer, F4, and then we get the final output Y, which can be the label of the identified object in that picture. As you can see, any trained model is a mathematical transformation, taking input and mapping it into some output. As part of the training phase, the job of the algorithm is to consume a lot of data and use it to generate this mapping function F, to be used as the final trained model. It is an optimization exercise, tuning the F mapping function step by step while using the training data. The training phase is all about data, and it's important to understand what kinds of data types are available and how to handle them with the right machine learning box. That's our next topic. See you again.
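A hedged toy sketch of that nested structure follows; each "layer" below is a made-up stand-in function, not a real neural network layer.

```python
# Toy illustration: a four-layer network as nested functions,
# Y = F4(F3(F2(F1(X)))).

def f1(x): return [v * 0.5 for v in x]       # first layer: transform raw input
def f2(h): return [v + 1.0 for v in h]       # each layer feeds the next one
def f3(h): return [max(0.0, v) for v in h]   # e.g. a simple activation step
def f4(h): return sum(h)                     # last layer produces the output

x = [0.2, -0.4, 0.9]        # input X, e.g. pixel values
y = f4(f3(f2(f1(x))))       # propagate through all layers to get Y
print("Y =", y)
```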
13. S03L06 Data Types: In previous lectures, I mentioned that the trained model in machine learning is created by an algorithm, or a group of algorithms, using data as input. The final ML box uses the trained model to take data as input and generate data as output. It's all about data, so I think it's important to understand the concept of data: what kinds of data those machine learning boxes can handle. There are three main types of data. The first one is called structured data. Structured data is data that has a defined format; it is highly organized and arranged in a predefined structure. For example, a list of customers on a website will be handled and organized in a structured format, including all kinds of information about each customer, like name, age, address, phone number, email, identification number, and more. We can present that list of customers as a simple tabular view, like a spreadsheet or a database table. Each row represents a single customer, and each column is a specific piece of information about that customer, like the email or name and so on. Structured data is organized and consistent, making it easily searchable and accessible by humans and computers. It's easy to open a spreadsheet and quickly find the relevant customer, and then the relevant piece of information about that customer. It's very organized. On the other hand, structured data has low flexibility. For example, if I want to add another piece of data to the list of customers, another attribute that describes a customer, it will require substantial work to make that adaptation across the different systems that use the data. On the other side of the spectrum, we have unstructured data. Unstructured data is information that does not have a predefined data structure, or is not organized in a predefined manner. It's the opposite of structured data. Let's talk about a few examples of unstructured data: emails, which are text data, social media posts, Word documents, PDFs, books, articles, and blogs. It can also be multimedia data like images, video, and audio files. Unstructured data is considered more complex and more challenging to process. Think about analyzing the text of an email. Finding patterns in unstructured data requires using more advanced machine learning methods, like deep learning; we'll see that during the training. And the last type sits between them: semi-structured data. Semi-structured data is a hybrid type of data, located between structured and unstructured data. It has a partial structure. An email is a simple example of semi-structured data: it has a combination of structured and unstructured data. The structured part includes fields like the sender, the date, and the subject of the email. On the other hand, the content of the email can contain unstructured data, like free text or attachments. Another example is log files created by systems, for instance when they measure things like a sensor measuring temperature, or when they record error events when something is not working in a system. Such log files may contain semi-structured data, where certain information is organized in a consistent format (every event will have a timestamp, a severity level, an event ID), but the content of the log message can be free text, which is unstructured data. Perfect. We talked about the three main types of data. Those data types can be the input or output of a machine learning box. Later in this training, we will see how different methods are used to handle specific data types. Let's zoom in a little bit more and talk about the concept of features in machine learning.

14. S03L07 Features: Welcome back. We just talked about the main types of data: structured, unstructured, and semi-structured. I also mentioned that the type of data we would like to feed a machine learning box has a direct influence on the methods that will be used to train the model inside that box. When feeding data into a machine learning box, the data will be divided into more digestible pieces. Those pieces are called features. Features are the input variables used to train the model and, later on, also to make predictions, classifications, clustering, or content generation. It is like taking the data stream and slicing it into more meaningful pieces. So what is a feature? A feature is an individual measurable property or characteristic of the data being analyzed. A feature can be related to unstructured data or structured data. Let's talk about a few examples. A numerical value, like the age of a person, is an example of a structured-data feature, and so is categorical data, like gender or color, or a date and time. From unstructured data we can extract features as well: text features, like quotes, phrases, or topics; image features, meaning pixel values, textures, or patterns extracted from the image; and audio features, like the spectrum, pitch, and other audio characteristics. Let's assume that our machine learning box is about predicting the likelihood, or the risk, of developing a specific disease. In that case, the features could be, for example: the age of the person, which is numerical; geo-location, which is a categorical feature; gender, again a categorical feature; blood pressure, numerical; smoking status (smoker or non-smoker); health condition; and those are just examples. These features provide the model with the necessary input to learn and make predictions about the target variable, such as whether a person is likely to develop a specific disease. Features are crucial for the model's learning process, as they represent the input data used to predict the target variable. The quality and relevance of features can significantly impact the performance of a machine learning model. For example, if I drop the age value as an input feature for our machine learning box, it may not be able to accurately predict whether that person will develop a specific disease.
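Here is a small, hedged sketch of that disease-risk example as one input record; the feature names, values, and the simple encoding are invented for illustration.

```python
# Toy illustration: the disease-risk features as one input example (X).

patient = {
    "age": 52,                        # numerical
    "geo_location": "north",          # categorical
    "gender": "female",               # categorical
    "blood_pressure": 128,            # numerical
    "smoking_status": "non-smoker",   # categorical
}

# Most algorithms expect numbers, so categorical features get encoded:
x = [
    patient["age"],
    patient["blood_pressure"],
    1 if patient["smoking_status"] == "smoker" else 0,
]
print("numeric feature vector:", x)   # this is what the model actually sees
```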
In some cases, the engineers responsible for training a machine learning model may decide to remove, transform, encode, or combine different features. It's called feature engineering. Feature engineering is the process of creating new features or transforming existing ones, and it is a crucial step in many machine learning projects. For example, taking the age feature and transforming it into a specific age group: instead of using the raw age value like 1, 2, 3, 4, and so on, we can bin the ages into categories, so group one will be 0-18, group two will be 19-30, and so on (see the small sketch below). This can help to capture relationships between age and disease risk, especially if the risk changes significantly at certain age thresholds.
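A hedged toy version of that binning transformation; the bin edges beyond the two groups mentioned in the lecture are assumed.

```python
# Toy illustration: feature engineering by binning raw age into age groups.

def age_group(age: int) -> str:
    if age <= 18:
        return "group 1 (0-18)"
    if age <= 30:
        return "group 2 (19-30)"
    return "group 3 (31+)"  # assumed catch-all for the sketch

for age in [7, 22, 45]:
    print(age, "->", age_group(age))  # the model sees the group, not raw age
```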
By carefully selecting features, engineers can improve the accuracy and effectiveness of machine learning models. All right, we talked about the ML box and typical ML tasks. Then we reviewed the training phase, the trained model, different data types, and also features. It's time to talk about the teacher who is supervising the training process. See you next.

15. S03L08 Supervisor: Hi and welcome back. In our last lectures, we talked about the training phase used to create a trained model for an ML box, which will then use that trained model to do something. But how is that training phase done, and which methods are being used to train a model? Let's talk about those questions. In machine learning, there are a couple of methods that can be used: supervised learning, unsupervised learning, semi-supervised learning, and the last one, reinforcement learning. Under each method, there are multiple machine learning algorithms that can be used. The selection of the most relevant method, and the best algorithms to perform the job, will be based on the data type and the required objective. For the first three options, we can see that the name is somehow related to the supervision level: how much external human intervention is needed to supervise and control the process of training a model. The first, and probably most popular, method to train machine learning models is called supervised learning. Supervised learning is called supervised because it is guided by labeled data. Let me explain that concept of labeled data. Using this method, an algorithm trains a model using a specific, pre-selected dataset. This dataset is called the training dataset, and it is labeled data. Labeled data is basically a collection of many data examples, where each example is a pair of the input and the expected output. Because the expected output is provided, it is called labeled data: we know what the input is, and we know what the output is. The labels provide the model with the correct answer for each input, acting as a supervisor, or teacher, during the learning process. Think of it like a student learning a new subject with a teacher. The teacher provides the student with exercises and their correct answers, guiding the student's understanding. In the same manner, in supervised learning, the labeled data is used to guide the model to better understand the relationship between inputs and outputs. As a simple example, a training dataset can be a collection of 10,000 images, and for each image, the label will be a text that describes the main object in that specific image, like a house, dog, cow, cat, bike, et cetera. Each example is a pair of input and output: the input is the image, and the output is the label. Looking at the following diagram, we have a training dataset with many examples, all those images, and each example has an input and an expected output. We feed the model with the first example, the first image, meaning input data X. The model predicts the output Y, and then we compare the predicted output with the expected output. So, for example, if I feed an image with a cat as the main object, and the algorithm identifies it as a dog, then something is not working. If they are not the same, it means that the model should be tuned a little bit, because there is an error in the prediction. The algorithm will adjust the model parameters and then try again: predict again and check the error again, trying to reduce the error to a minimum. That's called optimization. It's a repeated process while digesting a large number of examples, as the small sketch below shows.
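Here is a minimal, hedged sketch of labeled data and the supervisor's compare step; the file names and the stand-in model are invented.

```python
# Toy illustration: labeled data is a collection of (input, expected output)
# pairs, and the "supervisor" compares predictions against the labels.

labeled_data = [
    ("cat_photo.jpg", "cat"),   # input, expected output
    ("dog_photo.jpg", "dog"),
    ("cow_photo.jpg", "cow"),
]

def model_predict(image_file: str) -> str:
    return "dog"  # stand-in for a real trained model's answer

errors = 0
for image, expected in labeled_data:
    if model_predict(image) != expected:
        errors += 1  # this error signal drives the parameter adjustments

print(f"{errors} wrong out of {len(labeled_data)}; adjust, retrain, repeat")
```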
On the other side of the spectrum, we have unsupervised learning. With unsupervised learning, the model is trained using unlabeled data. That's a major difference. While supervised learning focuses on training models with labeled data, unsupervised learning trains models on unlabeled data. The model learns patterns directly from the data, without any guidance or targets. The main goal when using unsupervised learning is pattern discovery. We can take a large amount of raw data and feed it into a machine learning box using unsupervised learning, and it will try to discover hidden patterns, structures, or relationships within the data itself. Those patterns can be used to create useful insights. It's like exploring a new territory without a map, aiming to uncover hidden places: you don't know what you're going to find. A classic example of using unsupervised learning is clustering. We talked about clustering as a category of tasks in machine learning: checking whether there are data points that naturally fall into different clusters, different groups. That's unsupervised learning. One of the biggest challenges when training a model with supervised learning is getting enough labeled data; otherwise, the model will not perform so well. Sometimes getting enough labeled data is an expensive or time-consuming process, while we have easier access to unlabeled data. This is the situation in many projects. For that scenario, a third method was developed to bridge the gap. It's called semi-supervised learning. Semi-supervised learning is a hybrid approach that combines elements of supervised and unsupervised learning. It involves training a model on a dataset that contains both labeled and unlabeled examples: it utilizes a small amount of labeled data and a large amount of unlabeled data to train a model. Therefore, it's a very cost-effective option to improve the accuracy and efficiency of machine learning models. The last method to train machine learning models is called reinforcement learning. It is a completely different approach compared to the three methods we covered so far. It is inspired by how humans and animals learn through trial and error, where actions that lead to positive outcomes are reinforced, while those that lead to negative outcomes are discouraged. Using this method, we train something called an RL agent, a reinforcement learning agent, by having it interact with the environment to maximize something called a reward signal. The agent is basically a decision-making machine. It is constantly making small decisions, or actions, trying things and learning from experience. Imagine that you play table tennis with a robot that is controlled by such an RL agent. The RL agent gets a positive reward each time it wins a ping-pong session. It may play very badly when starting a new game and then improve while learning how to play better. It will learn which actions and what kinds of strategies can lead to better, positive outcomes. It is very similar to how we learn from experience. Over time, the RL agent develops a policy, also called a strategy, for selecting actions to maximize the reward. That's the model created by reinforcement learning. Reinforcement learning is used in many fields: in transportation, while developing self-driving cars that can navigate roads and make decisions; in robotics, for teaching robots to perform tasks in complex environments; or in the gaming industry, while developing agents that can play a variety of games. By the way, some use cases use a combination of those four options to train models. Let's summarize everything we've covered so far in this section. See you in the next lecture.

16. S03L09 Summary: Hi, and welcome to the last lecture in this section. I would like to summarize all the things we covered so far. We started by using the concept of a machine learning box as a simplified representation of any machine learning solution. This box takes input data, processes it, and produces an output. The type of data and the desired output depend on the specific use case you would like to handle. Next, we talked about the four main categories of tasks performed by machine learning solutions: prediction, classification, clustering, and content generation. Prediction involves forecasting future values based on past data, while classification is used to assign input data into predefined categories, like labels. Clustering is used to group similar data points to find hidden patterns inside, and content generation creates new content based on learned patterns. How does the ML box know how to perform a specific task? Well, based on training. During training, the ML box learns from data to acquire the knowledge needed for performing different tasks. This process involves collecting data, preparing it, selecting algorithms, and then evaluating the model's performance during the training. The final trained model is then used by the ML box in production to do something useful. It is a complex process that requires time, resources, and expertise. We also saw how any machine learning solution can be represented as a simple mathematical formula, Y = F(X). X is the input data, Y is the output, and F is the trained model. During the training process, the algorithms tune the mapping function F using a large amount of data. Can we push any type of data into a machine learning box? Well, no. We need to understand the data types to better match the right solution. In that context, we mentioned the three main types of data in machine learning: structured, unstructured, and semi-structured. Structured data has a defined format, unstructured data lacks any defined format, and semi-structured data falls between the two. Examples of structured data include lists and tables, while examples of unstructured data include text, images, and videos. Semi-structured data has a partial structure, and we mentioned emails and log files. Now, data is a very high-level concept. We need to make it more granular and divide the data into more manageable pieces. That's the concept of features.
Features are the individual pieces of data that are used to train a model and later on make predictions or something else. They can be numerical, categorical, text, images or audio. As part of improving a model's performance, it is a good practice to perform something that is called feature engineering, meaning add, remove, or transform specific features. Lastly, we covered the different methods used to train machine learning models, including supervised, unsupervised, semi-supervised, and reinforcement learning. Supervised learning uses labeled data to train models for prediction, classification, and data generation, while unsupervised learning uses unlabeled data for pattern discovery. Semi-supervised learning combines both labeled and unlabeled data and can be used as a cost-effective option to handle the situation where there is not enough labeled data. And the last one was about reinforcement learning, which is based on trial and error as a learning method and can be used to handle very complex scenarios where the box needs to interact with the environment. The choice of method, or a combination of methods, depends on the data type and the desired objective. There is an increasing number of models that utilize a combination of those learning methods. They can use supervised learning together with unsupervised learning and even use reinforcement learning. And this is absolutely the direction of many use cases. That's it for this section. As a soft introduction to machine learning, you're more than welcome to test your understanding with a quiz following this lecture. In our next section, we are planning to talk about the secrets of generative AI. See you next. 17. S04L01 Introduction: Hi, and welcome. Thanks for watching so far. Did you ever see a live magic show on stage? I guess you managed to see a couple of such shows. It's a great experience. You are watching every step that the magician is taking and saying to yourself that you're going to get it, you're going to reveal how he or she is doing that magic show. But unfortunately, in most cases, the magician is doing that performance so well that you are just surprised, amazed, with a big smile on your face. That's the job of a great magician. Going back to our topic, the capabilities of generative AI to understand human language and to create content based on text seem almost magical, like our magician. It's quite amazing that we reached a point where those systems can understand complex text as input and generate many types of content. It's a big step forward in the AI landscape. There is a tremendously growing list of tasks that can now be handled by generative AI based systems. Maybe ten or 20 years from now, it will be so embedded in our daily lives that it will not be so amazing anymore. We get used to it like any technology. But how do those generative AI systems do their job? What is the secret engine running inside? It is a good and important question. Even if most of us will not build those systems, we are going to use them, and for using them wisely, it will be useful to better understand the key principles of those technologies. That's the main objective of this section. We will uncover the secrets related to generative AI. You don't need any background in math, computer science or programming. Just bring a nice coffee or tea, and let's start our journey. See you in the next lecture. 18. S04L02 Artificial Neural Networks: The first building block of generative AI is artificial neural networks created using deep learning.
Artificial neural networks, or in short, ANNs, are computer-based models inspired by the biological neural networks found in the human brain. Don't worry, we are not going to make it too complex. Let's look at the high level structure of an artificial neural network. We can see three layers: input, hidden and output. It's a very simple illustration. On the left side, the input layer receives the input data, which can be numbers, images, text, or any other form of information. The input data is divided into features like X1, X2, X3, et cetera. A simple example: if the task is to predict the price of an apartment, as we saw before, then the input can be a collection of features like the overall size, number of rooms, location, and more. If the task is to classify an object in a picture, then the input will be a collection of pixels of that image. Then we have the hidden layers inside. These layers process the input data and extract relevant sub-features. They can be based on multiple layers, allowing the network to process more complex patterns. That's why it's called deep learning. The depth is correlated with the number of those hidden layers. The last layer is the output layer that produces the final output, which can be a classification, prediction or generation of new data. We can also see many lines of connection between nodes inside those layers. Each connection between nodes has a certain weight that is adjusted during training to optimize the performance of the network. For example, if the overall size of an apartment is a critical factor in predicting the price as output, then it will have a stronger weight number. The algorithm used to train that model adjusts the weight of each connection. If the input feature is important, then it will have a bigger and stronger connection with greater influence on the output. More important features will be translated into stronger signals propagating inside the network with a greater influence on the output. That's a high level definition and illustration of an artificial neural network. As part of the evolution of deep learning, multiple internal architectures, or deep learning architectures, were developed to handle different types of data and different types of tasks. That's the topic of the next lecture. 19. S04L03 Deep Learning Architectures: As part of the introduction to machine learning, we talked about the concept of training a model. The model represents the knowledge of the machine learning box. When the patterns inside the data are very complex, the typical method will be to use deep learning, so the trained model will be based on multiple layers inside the neural network. Those layers will be able to catch more complex patterns. That's the main concept of deep learning. As part of the evolution of training a model using deep learning, three main types of architecture were developed. Let's quickly present each one of them and their important relation to generative AI. Using the first method, meaning recurrent neural networks, the machine learning system is processing the input data sequentially, meaning one data element at a time. Each processed element is adding some tiny knowledge and changing the internal state of the trained model. It is slowly capturing connections and patterns between data elements. As you can imagine, it will take a lot of time to process all data elements because we are doing it sequentially, one by one. It's not a good solution for huge amounts of data.
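As a loose illustration of that sequential flow, here is a tiny Python sketch; the numbers and the update rule are invented for illustration, while real recurrent networks use learned weight matrices:

```python
# Toy sketch of sequential processing: each element updates an internal
# state one step at a time, so earlier elements keep influencing later steps.
elements = [0.2, 0.7, 0.1, 0.9]  # hypothetical numerical input elements
state = 0.0                      # the internal "memory" of the network

for x in elements:
    # Mix the previous state with the current element (an invented rule).
    state = 0.5 * state + 0.5 * x
    print(f"after element {x}: state = {state:.3f}")
```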
Still, it was one of the main methods to train models with deep learning for all kinds of tasks related to processing language. The next type of neural network architecture that evolved is called convolutional neural networks. It is specifically designed for image and video processing tasks like image classification, object detection, image segmentation, and so on. Around 2017-2018, a team at Google developed a new architecture to process input data. It is called the Transformer architecture, which was a key breakthrough in deep learning. It is now the de facto standard for training many models with deep learning. Using this architecture, an ML system can process input data in parallel instead of processing data sequentially like in RNNs. The parallel processing is done during the training phase, as well as later on when the trained model is ready and being used in production. This approach is more efficient, helping to speed up the training phase. By the way, this training phase is based on a famous hardware component called a GPU. GPU stands for graphics processing unit, and for many years, GPUs have been a great solution for running amazing graphics. Every computer is equipped with a GPU chip. Another key use case that evolved in the market is for training deep learning models. Those GPUs are the powerhouse for parallel processing using the transformer architecture. All cloud providers like Amazon AWS and Google Cloud are buying those GPUs to provide the cloud infrastructure for training models at scale. Parallel processing is the first key advantage when using the transformer architecture. Let's talk about the second key advantage. Do you remember using markers in your notebook while analyzing some text? I used that a lot during my engineering studies. My strong hand is my left hand, and I was always slower than the rest of the class when writing. Writing fast and clear is not my strong side. At some point during my studies, I decided to change my learning strategy and listen without writing anything, even a single word. At the end of the class, after a couple of lectures, I just used the copy machine to copy from someone else with much better handwriting capabilities. As you can imagine, my best tools were markers. I used a sophisticated color method for marking keywords. Many people are using that simple method because it helps to emphasize in a visual way which words or groups of words in the sentence are more important, marking key terms to remember. Your mind can focus on those key terms and map the connections to other words. The transformer architecture uses a similar approach when analyzing input data to help the system better understand which data elements as part of the input are more important than others. As a big surprise, it is called the attention mechanism, helping the system to pay more attention to specific elements as part of the data stream, like a marker putting more emphasis on a specific keyword or a pair of words using the red color, and another marker with a green color. Let's take a simple example. If I provide the following sentence to a generative AI system: how to create Python code that can calculate the sum of two numbers. The AI system will break down the sentence into individual words. This process is called tokenization, creating small tokens. We'll talk about tokens in the next lectures.
Then it calculates something that is called attention scores for each pair of words, trying to determine how much attention should be paid to one word when processing another word. This helps the AI system to better understand context. For example, the attention mechanism might determine that calculate, as a word, is closely related to sum, because calculating a sum is a core action. Or Python will have a high attention score with code, since Python describes the language used to write the code, and so on. So as a quick summary, using the transformer architecture, the parallel processing coupled with the attention mechanism enables generative AI systems to digest more data, process it faster and catch more complex patterns. It is the backbone of generative AI. Now, why is it important? Well, the power of generative AI is based on the capability to digest and understand the human language. In machine learning terminology, it is called natural language processing. Before using the transformer architecture, it was very difficult to generate a good machine learning model that could handle the human language. Human language is highly complex. It is unstructured data. The text is contextual. One word inside a sentence, or a sentence inside a paragraph, influences the meaning of the next word or the next sentence. This architecture is helping to train very complex models that can handle the human language. They are called LLMs, large language models. Those models are now the foundation of many generative AI use cases. Now, as you can imagine, training an LLM requires a huge amount of data and a huge amount of computing resources. Who can build and train those large language models? That leads us to the concept of foundation models, our topic in the next lecture. 20. S04L04 Foundation Models: Hi and welcome back. In the previous lecture, we talked about the evolution of deep learning methods with the introduction of the transformer architecture. Before using the transformer architecture, most of the models were simpler, specific-purpose AI solutions, meaning the trained model was great for handling a specific task. This architecture created the required environment to train models with much more data and make them more generic. It was a huge step moving forward. That's the concept of foundation models. Foundation models are large-scale, more generic models trained on massive amounts of data that can be adapted to perform a wide range of tasks. Therefore, they provide a strong foundation that can be adapted to a large number of use cases. It's a great building block. Think about a library of thousands of different science books or historical books. Each book is used as input for training the same foundation model. The result will be a foundation model that has knowledge on a variety of topics, millions of topics. Training a foundation model is an expensive, resource-intensive project. We need the ability to collect, store, and process huge amounts of data. We need the hardware and software infrastructure and a team with the relevant skill set. That's where the big players can leverage their power. Big players like Google, Microsoft, Amazon, and others. They can train large foundation models using huge amounts of data. One of the most popular foundation models is GPT, running inside the famous ChatGPT service introduced by OpenAI. GPT stands for generative pre-trained transformer. Let's break down those words because we already covered those terms. The first one is generative: it means that the model can generate content based on the input.
This is the main task of that model, generating content. The next word is pre-trained. This model was trained on a large amount of data from diverse sources, such as websites, books and articles. It is a foundation model. And the last one, transformer. That's the internal architecture of the model, which is becoming the de facto architecture for creating foundation models. ChatGPT is a great example of using a foundation model designed to be accessed by anyone. Users can directly interact with the model using simple text as a prompt, ask a question and get an answer. Given the versatility of a foundation model, smaller players like medium-sized companies or startups can leverage those foundation models developed and provided by the big players. Instead of investing millions of dollars in training such models from scratch, they can adapt an existing foundation model for a fraction of that amount and introduce new AI based products more quickly. As I mentioned, a foundation model is a building block. Now, there are many types of foundation models. Some are focused on handling natural language processing. Some of them are focused on computer vision tasks like image and video generation, speech recognition and more. One of the core foundation model types is for natural language processing. They are called LLMs, and that's the topic of the next lecture. 21. S04L05 Large Language Models (LLMs): Hi and welcome. In the previous lecture, we talked about foundation models. A foundation model is a model that was trained with a large amount of data coming from different data sources. It can be used as the building block or foundation for other, more tailored models. One of the most popular types of foundation models is the large language model. Large language models are the core capability of generative AI to handle text as input and output. That's one of the main engines running inside any generative AI solution. They are widely known for their amazing ability to analyze, understand, and generate high quality written text as a response. Using those models, machines can understand and respond in native human language. And they are getting better and better. The introduction of LLMs is a huge step in the AI industry with tremendous potential to impact almost any domain. In the previous lecture, I mentioned the famous ChatGPT in the context of a foundation model. ChatGPT can serve as a base for many types of applications and tasks. The type of model that ChatGPT is based on is an LLM. Another important aspect to mention is that not all LLMs are created equal. The word large can be misleading, and we need to explore that a little bit. LLM is a generic term for large language models. But what is the meaning of large? How is one LLM larger than another LLM? One way to estimate the power of an LLM is to look at the model size. A model size typically refers to the number of parameters in the model. It's like the number of brain cells. Parameters are the small elements or variables adjusted during training so the model can process input data and generate output. That's the knowledge of the new model. More parameters are directly correlated with more capacity for storing knowledge. The number of parameters in a typical LLM is measured today in billions, like 1 billion, 10 billion, 100 billion, and more. Today, a model size of around 100-200 billion parameters is considered to be a large model with great capabilities.
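To get a feel for what such a number means in practice, here is a rough back-of-envelope calculation in Python; the 2 bytes per parameter figure is my own assumption, corresponding to storing weights in 16-bit precision:

```python
# Back-of-envelope: memory needed just to store the weights of a
# 100-billion-parameter model at 2 bytes (16 bits) per parameter.
parameters = 100e9
bytes_per_parameter = 2
total_gb = parameters * bytes_per_parameter / 1e9
print(f"~{total_gb:.0f} GB just for the weights")  # ~200 GB
```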
The model size is not always information that is published by the company that trains that model, because those players are competing, and they may decide not to share that information. This is something to consider. In addition, we need to keep in mind that this industry is in constant evolution. 100 billion parameters today may seem small five years from now, when the models will be measured, I don't know, maybe with 100 trillions of parameters. Those numbers are not written in stone. They will eventually change, and the benchmark to be considered large is going to be higher. Let's talk about the famous ChatGPT that is based on the GPT model. GPT has evolving versions: one, two, three, four, et cetera. As you can imagine, each version is bigger. They are using more data to train the model. There are more parameters in that model. However, it's not a free lunch. A bigger, more complex model will require more computing resources to train and deploy. The complexity of the model will influence many factors, like the required dataset and volumes to train the model, the number of computing resources that will impact the cost of the cloud resources running that model, and also the skill set to keep it running. Secondly, a bigger model isn't always better for a specific use case. As an example, suppose I would like to build or use an off-the-shelf model that can recommend the best restaurant at a given location based on input data. The user can write some text about the required restaurant, and it will provide some recommendations. In that case, I don't need a model that was trained on the complete human history. That knowledge is nice but less relevant to that specific task. I need a smaller, more cost-effective model that is more tuned to handle that specific task. As I mentioned before, training foundation models is the job of big players. They have the resources to create such models with billions of parameters. Other market players will focus on what kind of tasks they would like to automate in their workflow using AI and then check which models in the market are cost effective for those tasks. In the next lecture, I would like to talk about the main categories of LLMs. See you next. 22. S04L06 Model Types: Hi, welcome. There are many types of foundation models, and to be more specific, there are many types of LLMs, so it will be useful to categorize them along several dimensions. The first dimension is general-purpose LLMs versus domain-specific LLMs. As the name suggests, general-purpose LLMs handle a wide range of tasks. They will be trained by taking massive amounts of data from the Internet and other data sources. The outcome will be a large language model that can handle a variety of topics. ChatGPT and Google Gemini are general-purpose LLMs. The second one is domain-specific LLMs, also called specialized LLMs. Those LLMs are trained to handle tasks related to a specific domain like finance, legal, cybersecurity, gaming, medical, and countless other types of domains. All the data for training a domain-specific LLM will be related to that domain. Those LLMs can be further fine-tuned to handle a more niche area of a specific domain. I assume we will see more and more companies training domain-specific LLMs. The next important dimension is open source versus closed source LLMs. Open source models are models that are available to the public and can be used without any commercial cost. It's like open source code. Anyone can download a model, change and customize it.
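For example, here is a minimal sketch assuming the Hugging Face transformers package is installed (pip install transformers); gpt2 is just one example of a small, openly available model that anyone can download and run locally:

```python
from transformers import pipeline

# Download a small open model and generate text locally, with no paid API.
generator = pipeline("text-generation", model="gpt2")
result = generator("Machine learning is", max_new_tokens=20)
print(result[0]["generated_text"])
```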
Like any open source project, it is not always tuned for production, but it can be used as a development framework, a starting point for a proof-of-concept project, and more. One big advantage is that we will have full control over the model. We can inspect the code and underlying architecture, customize it as needed, and most importantly, there is no license to use it. Another advantage is related to data privacy. In some cases, an organization cannot upload sensitive data to a third party service, so by using an open source model, which is used internally, sensitive data will be used with a low risk of data leakage. On the other hand, closed source LLMs are proprietary models owned by companies and are not available to the public as source code. They are available to the public as web based services or using APIs. It is a model which is encapsulated in a box. All we need is to provide the input and then get the output. The company that developed a closed source LLM will provide some level of free access with limited capabilities, and on top of that, more premium options based on a pricing model. They are monetizing the trained LLMs. ChatGPT and Google Gemini are examples of services based on a closed source LLM. A closed source LLM will be more optimized for production because someone else is investing resources to ensure it is working as expected. Secondly, it will be much faster to deploy. This approach simplifies the integration of AI into many applications. With a few lines of code, developers can integrate advanced AI capabilities into their application, without dealing with the whole concept of training models and keeping them updated. Now we are reaching the interesting question about LLMs. How do they work? What are the magical elements inside that LLM model? Well, you may be surprised that it's not as complicated as you may think. See you in the next lecture. 23. S04L07 Prompt and Tokens: Hi, and welcome back. We covered at a high level the concept of LLMs, large language models. The main capability of a large language model is analyzing, as well as generating, text. In addition, we divided LLMs into several categories, like general purpose versus domain specific and open source versus closed source. In this lecture, I would like to take a step back and talk about the input of an LLM, meaning prompts, and how a prompt is broken into little chunks. It's another building block while we keep revealing the secrets of generative AI. Let's start with the question: what is a prompt? A prompt is a piece of text that is given to the LLM as input. It can be used to control the output of the model in a variety of ways. The output of the model is known as the completion or response. When we provide a prompt to the model, it generates a response based on that prompt. The prompt is a group of words, sentences and paragraphs. Let's consider the English language as an example. It has more than 1 million words, including different forms of the same word, technical terms, slang, compound words and more. It is a complex language that is evolving all the time. A typical natural language will have a huge number of possible words, creating complex combinations of words and sentences. As a result, the complex text format is not the most efficient way to process and store data for a machine learning system. It should be simplified somehow. The solution for that challenge is to convert parts of words, complete words, and combinations of words into numbers, numerical data. Those numbers are called tokens.
In that case, the model handles tokens, which are numbers, instead of dealing with words. So what is a token? A token is a fundamental term in generative AI. Tokens are essentially numerical representations of characters, words or phrases. Tokens refer to units of text that the model processes. A token can be a single character like the letter B, a complete word like flower, or a combination of words like ice cream. By representing words as numbers using tokens, the model can perform operations on them more quickly and efficiently. The full set of tokens used by the model is called the vocabulary of the model, and the process of splitting text into tokens is called tokenization. The component that performs that process is called a tokenizer. In essence, an LLM is getting a sequence of tokens created by a tokenizer, breaking the input text into tokens. Let's see that in action. I will open the OpenAI tokenizer website. I hope they will keep the link available to the public so you can play with it as well. This tool can be used to see how a specific model will break any prompt into tokens. I will paste a short text as a prompt inside and see how it is converted into tokens. This counter shows the total number of tokens. Each token is colored with a different color, so we'll be able to see the flow of identified tokens in the text. Just keep in mind that the list of tokens for a specific given prompt text may be divided differently depending on the model. More advanced models may use a different tokenizer. You can say that tokens are part of an internal process inside the AI system, so why should we care about them? That's a good question. Let's answer that one in the next lecture. 24. S04L08 Total Tokens and Context Window: Hi and welcome. We just talked about the concept of breaking and translating text input prompts into tokens. Tokens are numerical representations of characters, words or phrases. It is an internal process inside a generative AI system, and the question is: why should we care about it? Let's tackle that question. Tokens are a fundamental metric for measuring usage in generative AI systems. The total number of tokens for a given input, plus the total number of tokens generated as part of the output, is a measurement used by companies to track and limit the usage of generative AI services. As an example, assuming a company developed an application that is powered by a third party generative AI service, in that case, they will optimize the input prompts to minimize cost. Usage is eventually translated into cost, and cost optimization is important. Another key issue is related to model limitations. Different models have different token limits, which can affect the length and complexity of the prompts we can use. Let's assume that you are discussing some issue with your friends. It's a long ping pong session. You are saying something and, based on that, your friend is saying something. Both of you are trying to consider the context of the discussion to keep the flow. But if it's a very long session, you or your friend will have some trouble remembering all the things each and every one of you mentioned over, I don't know, the last 3 hours. We have limited capacity in our short term memory, right? Going back to generative AI systems, it is called the context window. A context window refers to the maximum number of tokens that a model can process and consider at once, as a group, when generating a new token.
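Since both usage metering and the context window are counted in tokens, here is a small Python sketch assuming OpenAI's tiktoken package is installed (pip install tiktoken); other models ship their own tokenizers, so the exact split will differ:

```python
import tiktoken

# Encode a short text into token IDs, the numbers the model actually sees.
enc = tiktoken.get_encoding("cl100k_base")
text = "Ice cream is delicious!"
tokens = enc.encode(text)

print(tokens)       # the numerical token IDs
print(len(tokens))  # the count used for usage metering and context limits
```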
This window is a crucial factor related to the model's ability to understand and respond to complex prompts or generate long text as output. A larger context window enables the model to generate more sophisticated and contextually relevant outputs. Imagine that you are feeding a generative AI system that has a 10K-token context window an article which is broken into 15K tokens. In that case, the system will not be able to digest and process the full article, leading to unexpected behavior of the model. This model can only consider looking back up to those 10K tokens. To make effective use of the context window, it is important to manage how text is presented to the model. For example, long documents may need to be chunked to fit within the context window. In our example, I can break the article, which is 15K tokens, into several chapters where each chapter is not larger than the size of the context window. Let's take another example. If we are building applications that interact with generative AI systems through APIs, understanding tokens can help us design APIs more effectively, like setting appropriate limits on token usage, providing feedback on token consumption, and breaking queries into smaller prompts before sending them to the actual generative AI system. Overall, we can use that to optimize the API performance on our side. As a quick summary, by understanding tokens, we can better utilize a specific generative AI system, optimize our usage, and get the most out of those powerful tools. 25. S04L09 Next Token Please!: Another important aspect of tokens is related to the output of the model. We may assume that an LLM can think and understand the meaning of words, but this is not really the case. LLMs may seem very sophisticated, but on a practical level, those are machines that simply recognize patterns and use them to predict the next token. Let's explain that concept. An LLM operates in a sequential mode: each time, it will predict one single token as output. Then this predicted token will be used as input to generate the next token in a sentence, and so on. It looks like a sentence, but from the model's perspective, it is just a sequential list of related tokens that are still under the limit of the model's context window. Using this approach, LLMs can continue to predict a sequence of words that looks like a complete answer. That's the magic behind this technology. An LLM is a sophisticated machine for generating a sequence of tokens. Another interesting question is about the way a token is generated as output by the model. Well, it is based on statistics, and to be more precise, it is based on probability distributions. Let's open up that concept at a very high level. LLMs are trained on massive datasets, so they have tuned parameters capturing the statistical relationships between tokens. They learned which tokens have stronger statistical correlations to other tokens. By using this information, an LLM can make that prediction with some level of statistical confidence. The model processes the sequence of tokens as part of the input and calculates a probability distribution for the next possible token. Let's take a simple example. Assume the input tokens are the following: The cat sat on, and a missing token. This is the predicted next token. The model will take the provided tokens, like cat and sat on, and it will internally generate a list of possible next tokens with a probability distribution, like the next token can be fence, sofa, roof, et cetera.
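Here is a toy Python sketch of that selection step; the candidate tokens and their probabilities are invented for illustration:

```python
import random

# Given "The cat sat on", score possible next tokens and sample one.
# Because it samples rather than always taking the top choice, the same
# prompt can yield different continuations.
next_token_probs = {"sofa": 0.5, "roof": 0.3, "fence": 0.2}  # hypothetical

tokens = list(next_token_probs.keys())
weights = list(next_token_probs.values())
next_token = random.choices(tokens, weights=weights, k=1)[0]
print("predicted next token:", next_token)  # usually "sofa", but not always
```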
Then it will select the best matching next token, maybe sofa, with the highest probability. The model will predict the next token based on calculating which tokens have a stronger statistical correlation. To make it less deterministic, it will not always select the token with the highest probability. It may add some level of randomness to make sure we do not get the same output for the same input. If you use a solution like ChatGPT or Google Gemini, you may notice that for the same question, you may get slightly different answers while running the same prompt several times. Now, what is the logic of making the model less deterministic? Well, it is useful for adding a flavor of creativity and allowing for more diverse outputs, simulating some level of creative thinking. That's about how those LLMs are generating text as output. It is a sequential process to generate the next token in a feedback loop. However, I haven't mentioned yet how those LLMs are trained. That's the topic of the next lecture. 26. S04L10 Self Supervised Learning: Hi, and welcome. We covered many topics related to the underlying technologies of generative AI and specifically zoomed in on LLMs. We talked about the prompt input and how it is divided into tokens, considering limitations like the context window. We also mentioned that after the training phase, an LLM holds the patterns represented as statistical relationships between characters, between words and between sentences. It's basically using tokens. Using those patterns, the model can predict what comes next. What is the next token or word in a sentence? Then it will use the previously predicted word to predict the next word, one by one, sequentially, to create more complex structures like sentences and paragraphs. It is a sequential process. In this lecture, I would like to uncover another small secret about generative AI. I mentioned that LLMs are created, or better said, they are trained, using massive amounts of unstructured data. The question I would like to tackle is: how is that possible? What is the method being used? As you can see from the title of this lecture, it is based on a process known as self-supervised learning. Let's explain that concept. In supervised learning, the training data is labeled data, meaning each data point is an example with input and output. The output is the labeled data, and someone should provide those examples with labeled data. In many cases, providing labeled data is a costly process with many limitations. It's not always possible to get a large enough amount of labeled data. It becomes even harder when talking about training models for handling languages. Text is unstructured data. Self-supervised learning is an interesting method in machine learning where a model is trained using the data itself in a supervised manner, without using external labels. The idea is quite simple. Let's assume we are feeding a model a single page from a book. This page is based on multiple paragraphs and each paragraph has multiple sentences, right? Each sentence from that page can be used for training that model. How is the model achieving that capability? Well, it is based on several methods. One of them is called masked language modeling. It involves the following steps. Step number one is called masking. It will take a sentence and then, in a random way, replace some words in the input text with a special token called a mask. This creates a masked sentence. Step number two is called prediction. The next step will be to try predicting that masked part.
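As a loose illustration of the masking step, here is a minimal Python sketch; the sentence is just an example, and real systems mask tokens rather than whole words:

```python
import random

# Hide one word behind a [MASK] token and keep it as the training target.
# No human labeling is needed; the label comes from the data itself.
sentence = "The cat sat on the sofa".split()
position = random.randrange(len(sentence))
target = sentence[position]         # the word the model must predict
masked = sentence.copy()
masked[position] = "[MASK]"

print("input: ", " ".join(masked))  # e.g. "The cat sat on the [MASK]"
print("target:", target)            # e.g. "sofa"
```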
The model is trained to predict the missing words based on the context provided by the surrounding words in that sentence, and then compare the prediction to the actual masked part. Step number three is basically learning: by repeatedly performing these steps on a large dataset, the model will learn to understand the relationships between words, their meanings, and how they are used in context. Let's take a simple example. The input sentence will be: I love eating cheese pizza. The model can take that complete sentence and mask a word inside, like masking the topping. The task will be to predict the missing word. Possible predictions can be pepperoni, cheese, margherita, and so on. So how is this working end to end? The model analyzes the surrounding words: I love eating, and at the end, pizza. Based on the context, the model predicts that the missing word is likely a type of pizza topping. The model might consider the frequency of different pizza toppings in its training data, as well as the context provided by the word eating. Then the model will assign probabilities to different possible toppings, such as pepperoni, cheese, margherita, and so on, then select the most likely topping based on its probability distribution. Finally, it will calculate the error level between the actual masked word and the predicted value. This error will be used as a feedback loop to keep optimizing the internal parameters of the model. If the model is exposed to 1,000 similar sentences, and 70% are with the pizza topping cheese, 20% with margherita, and around 10% pepperoni, that's going to influence the probability distribution stored in the model. Now, this is just a single sentence, right? Imagine if we trained an LLM with millions of sentences, billions or trillions of words. By repeatedly going through these steps, the model learns to predict missing words with increasing accuracy. All that is done without human supervision. It's self-supervised learning, completely automated. As a result, it has massive scalability. The model can leverage vast amounts of unstructured and unlabeled data that is available everywhere. This process helps the model develop a deep understanding of the language, including relationships between words, their meanings, and how they are used in context. In that case, we will get a large model that will be good at generating a text response. That's how an LLM becomes so good at generating content based on text input. This pre-training step, based on self-supervised learning and other methods, is super important for creating foundation models. But it is not the end of the story. Those models can be tuned when generating responses, and that's the topic of the next lecture. See you again. 27. S04L11 Improving and Adapting LLMs: Hi, and welcome back. In the previous lectures, we covered many topics related to LLMs. An LLM is a foundation model that is pre-trained on a large amount of data. In this lecture, I would like to talk about the options to improve and adapt an LLM so it can be tailored and used as a building block for more specific use cases. Let's review them one by one. The first one is called contextual prompting. It is a method that is part of prompt engineering. Prompt engineering is the process of designing and crafting prompts to effectively communicate with and guide large language models to generate desired outputs. The idea is quite simple.
When we ask the AI system to do something, we should articulate as clearly as possible the required task, and in some cases provide context, any required background. That's the easiest and most cost-effective way to tune the responses we get from the model. In the next section, we'll see how to use contextual prompting. Let's move to the next option. The second method to tailor a model to specific requirements is called retrieval-augmented generation, or in short, RAG. It combines the power of pre-trained large language models with the ability to retrieve additional relevant information from external knowledge sources like external databases or documents. Why is this method useful and important? This method addresses some common problems associated with using public generative AI systems. The first issue is related to private data. Many companies are holding private data that cannot be exposed to the public and cannot be used by other companies to train foundation models. There is a gap between what a generic public LLM was trained on and specific, useful private data owned by a company. The second issue is about limited knowledge of facts and events that took place after the model was trained. This method is based on several steps as part of the process: get and analyze the input prompt; based on the requirement, extract useful relevant internal data from the organization's databases or any other knowledge source; and then feed the original prompt together with that extra information to a pre-trained model. It's like an enrichment process. For example, a company that has a support chatbot can use the RAG method to enhance the prompt with information about products and services. When a visitor is asking something about a product by submitting a query in a chat session, the chatbot will search internal databases and internal documents for information that will be useful as extra knowledge for the backend LLM. It will take the visitor's original query, add extra information, like maybe the product user manual, which is private data, and send it via API to the LLM. The LLM will be able to generate content based on the original query coupled with the extra data. In our case, the LLM may find the answer inside the product user manual using this extra information. I would like to emphasize that this method is not changing the LLM. It simply provides additional context and information to help the LLM generate more accurate and informative responses. By using this method, companies can leverage internal private data as additional information to enhance and tailor the generated content. Secondly, it can be used to bring the model up to date with recent events or any domain-specific content. One of the significant drawbacks of RAG is the limited size of the context window in most language models. This means that the model can only process and understand a certain amount of text at a time, so the system which is searching the extra data must be careful with the amount of data it is pushing to the LLM as extra information. Another issue is related to latency. Retrieving information from external sources can introduce latency, a delay, into the overall end-to-end generation process, potentially slowing down the response time. Lastly is the cost of using a large number of tokens as part of the input prompt. We need to submit a substantial amount of extra data for each query. It may not be cost effective for use cases that require a large amount of extra data at a higher frequency of requests.
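Here is a high-level Python sketch of that enrichment flow; everything in it is hypothetical: search_internal_docs() and call_llm() stand in for a real document search system and a real model API:

```python
def search_internal_docs(query):
    # In a real system: query a vector database or a document index.
    return ("Product X user manual: to reset the device, "
            "hold the power button for 10 seconds.")

def call_llm(prompt):
    # In a real system: an API call to a hosted or local language model.
    return f"(model response based on: {prompt[:60]}...)"

user_query = "How do I reset Product X?"
context = search_internal_docs(user_query)  # step 1: retrieve internal data
prompt = (                                  # step 2: enrich the prompt
    "Answer the customer question using the context below.\n"
    f"Context: {context}\n"
    f"Question: {user_query}"
)
answer = call_llm(prompt)                   # step 3: generate the response
print(answer)
```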
Let's talk about the next option to consider: fine-tuning. Did you notice that the training of a large foundation model like an LLM is called pre-training? It is called that way because it is common to have a two-step training process: pre-training and fine-tuning. The first step, called pre-training, is to train a foundation model with a massive amount of data. This step is done by the owner of the model or the company operating that service. Then, as step number two, other companies or individuals can take that pre-trained foundation model and retrain it with specific data. The result will be a new, fine-tuned model that is more optimized to handle a specific task or several tasks. It is called transfer learning: training a smaller, task-specific model on top of the pre-trained proprietary model. It's important to emphasize that fine-tuning is based on a small amount of data compared to the data used for pre-training the foundation model. Therefore, it's a very cost-effective option in some cases. Why not retrain a model from scratch? Well, I mentioned that this is the job of the big players; in most cases, fine-tuning is the more cost-effective option. Fine-tuning is not limited to open source models. There are many model providers which offer dedicated APIs that allow developers to interact with their models. These APIs are used to feed the model with custom data for fine-tuning. The result will be a new sub-model with updated parameters. There are a couple of benefits of fine-tuning compared to prompt engineering with context or using RAG. First is the latency. Unlike RAG, there is no processing time for each query to find the relevant enrichment data. Secondly, the prompt size will be much smaller because we don't need to provide extra data for every prompt, reducing token usage. On the other hand, fine-tuning is an advanced method that must be done carefully to get the right solution, and it will require some level of expertise and knowledge in that field. Otherwise, the output will be less useful than the original LLM. That's about the three main options to tailor existing LLMs. Thanks for watching so far. Let's summarize the complete section. 28. S04L12 Summary: Hi, and welcome back. Thanks for watching so far. I think we covered a lot of topics in this section. It was quite a comprehensive overview of many key terms in generative AI. I hope you feel that we managed to uncover many secrets related to this technology. Now, I would like to quickly summarize this section and create a connection between the topics. Generative AI is a type of artificial intelligence that focuses on creating new content. It added the important capability to analyze text as a language and use it as an input to generate different types of content. And it's not limited just to text. It can be used for generating other types of content. We started by talking about the main building block of any generative AI solution: the artificial neural networks created using deep learning. That's the internal structure that holds the brain and knowledge of the model. Next, we talked about the evolution of a couple of deep learning architectures used to train those neural networks. The latest and greatest development is the transformer architecture. This architecture created the environment to train and handle large models and make them more generic. Those models are called foundation models. They are large-scale, generic models trained on massive amounts of data that can be adapted to perform a wide range of tasks.
Therefore, they provide a strong foundation as a building block for many use cases. One of the most popular types of foundation models is the LLM, the large language model. LLMs are the core capability of generative AI to handle text as input and output. They are used to analyze, understand, and generate high quality written text as a response. Now, there are many types of LLMs. We mentioned categories like general-purpose LLMs versus domain-specific LLMs. General-purpose LLMs handle a wide range of tasks as they are trained by taking massive amounts of public data. On the other hand, domain-specific LLMs are trained to handle tasks related to a specific domain, like finance, legal, et cetera. All the data used for training a domain-specific LLM will be related to that domain. Another important dimension is open source versus closed source LLMs. Open source models are models that are available to the public. Anyone can download the model, change and customize it. On the other hand, closed source LLMs are proprietary models owned by companies and are not available to the public as source code. They are available to the public as web based services or via APIs. Those are closed boxes. Then we covered several key terms that are important when using generative AI models. The text input is called the prompt, and the output is called a completion or response. When we provide the prompt to the model, it generates a response based on that prompt. Next, we talked about the concept of tokens. Tokens are numbers that are generated by a component called a tokenizer, which is used to map the text into a numerical format. It is a more efficient way for models to process data. The number of used tokens is measured as a metric for service consumption. In addition, each model will have a limitation on the maximum number of tokens that can be handled as a group under the same context. It is called the context window. Using this knowledge, we managed to reveal how those LLMs are generating a complex text response. It is all about predicting the next token in a sequence. The sequence of generated tokens creates more complex patterns like words, sentences and paragraphs. Each predicted token is selected by looking at the statistical distribution of possible predictions. How do those LLMs digest massive amounts of unstructured data as part of the pre-training phase? It is based on a sophisticated self-supervised learning method. Using this method, a model is trained using the data itself in a supervised manner, without using external labels. For example, it is done by automatically taking sentences, masking specific words, and trying to guess the right answer, learning from that experience. The last topic was about the three main options to adapt and tailor an existing LLM. The first one is called contextual prompting. It is a method, as part of prompt engineering, where we provide context as part of the prompt to better guide the model to generate the output. The second method to tailor a model to specific requirements is called retrieval-augmented generation, or in short, RAG. The concept is to extract useful, relevant internal data from external databases and feed it, as part of a prompt, to a pre-trained model. By using this method, companies leverage internal private data as additional information to enhance and tailor the generated content. And the third method is fine-tuning, where we take a pre-trained model and retrain it with new data.
It is a practical method to shape existing LLMs according to specific requirements, optimized for specific domains and use cases; the output will be a fine-tuned model. That's a quick summary of the topics we covered in this section. Please use the quiz to test your understanding and feel free to share questions. As you can imagine, generative AI is not a perfect technology, and it has some limitations and challenges that we must consider while leveraging this evolving technology. That's the main topic of the next section. See you again. 29. S05L01 Introduction: Hi, and welcome back. Thanks for watching so far. I hope you are enjoying the training. AI, and specifically generative AI, is an exciting, innovative technology. The market is gaining momentum, and we can see more and more companies and individuals trying to leverage that technology in many use cases. That's the direction for the upcoming years, and it's a great opportunity for anyone to join this huge market wave. Now, I don't want to lower your excitement and expectations. Nevertheless, we must be fully aware that generative AI is not perfect. It is a new technology with limited market experience, and like any new technology, it has multiple challenges and limitations. Those limitations can create substantial risks in many practical market applications. As a simple example, the output generated by a generative AI model may seem smart, sophisticated, and very convincing, but sometimes it is just full of mistakes. Imagine that a company from the finance industry implemented generative AI in their support channel, and it is generating very nice answers, but with mistakes. That's a major issue to consider. Therefore, it is essential for individuals as well as for companies to be aware of those limitations and act with more responsibility when using those technologies. That's the objective of this section. Let's start our exploration. See you next. 30. S05L02 Prompt Sensitivity: Did you ever try to record yourself with a microphone? When tuning the sound amplifier for high sensitivity, it will amplify even a small sound. We need to carefully balance the sensitivity to avoid picking up all kinds of background noises. Secondly, we need to record in a quiet environment to avoid getting echoes. The input we feed the microphone is important. Any content creator knows about those challenges. In generative AI, the prompt is the main input to the model, like the microphone when recording our voice. That input is used to understand the requirements and the overall context. Therefore, those models are highly sensitive to the prompts they receive, like an amplifier tuned for high sensitivity. There is a famous sentence related to processing data. I imagine that you've heard about it: garbage in, garbage out. In our context, the quality of the prompt has a major influence on the output. If we feed the model garbage, meaning low quality, less organized input, we can expect to get garbage, meaning low quality output. It is as simple as that. The bottom line means that the model was not able to fully understand our requirements. That's the first challenge when using generative AI. We must pay attention to the quality of the input to maximize the quality of the output. That leads us to the concept of prompt engineering. We mentioned it in several lectures. Prompt engineering is the practice of crafting and optimizing prompts to effectively interact with and guide generative AI models. It's something that, when using generative AI systems, we are learning how to apply all kinds of techniques to make our prompts much better.
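As a quick, hypothetical illustration of the difference, compare a vague prompt with a more crafted one; the wording here is my own example, not a fixed recipe:

```python
# A vague prompt: the model has to guess the audience, tone, and format.
vague_prompt = "Write about our product."

# A crafted prompt: role, context, task, and expected format are explicit.
specific_prompt = (
    "You are a marketing copywriter. Write a three-sentence description "
    "of a noise-cancelling headphone aimed at frequent travelers. "
    "Use a friendly, confident tone."
)
```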
This involves crafting the right words, context, and instructions in the prompt to achieve specific types of outputs. As part of the Unleash the Power of Generative AI section, we will talk about a couple of useful tips related to prompt engineering. 31. S05L03 Knowledge Cutoff: Our next challenge of a generative AI model is called knowledge cutoff. As the name implies, many models are trained on data that was available up to a certain point in time. This is the date after which the training data used to develop the model does not include new information. It makes sense: we collect the required training data, train a model, and that's it. Any new information will not be part of the model that we already trained. Any data produced after the training phase is not part of the model's knowledge. If I train the model on data up to 2024, any data created later on will not be included. As you can assume, this knowledge cutoff will limit the model's ability to provide up-to-date information, up-to-date answers. As a simple example, a developer may ask a generative AI system to generate a piece of code to solve a specific task in a particular programming language. It will generate that code based on the trained model. However, if a new version of that programming language was released after the model was trained, the code may be less accurate while using things that are not relevant anymore. For use cases that are based on up-to-date knowledge, this limitation can be a significant issue to consider. Think about the finance industry, where up-to-date information is critical for making the right decisions. So what are the options to handle a knowledge cutoff? There are two main strategies. The first one is to update and retrain the model at repeated intervals using the most up-to-date data. Companies that retrain large foundation models will just release a new version, like ChatGPT versions one, two, three, four, et cetera. Still, even by using this option, there will always be a knowledge cutoff between the update intervals. If I update the model every six months, then within those six months it will have missing information until it is retrained. For some applications, it may be good enough as a way to mitigate that issue, but in other cases, it will not be good enough. A company that decides to train or tune its own model has the flexibility to optimize the time interval between updates. On the other hand, if we are using a third party solution, it is vital to be aware of the cutoff date and what the update intervals are. The second option to handle knowledge cutoff is to connect the model with external online tools like search engines and databases that will close the knowledge gap with the most recent events and up-to-date data. As you can imagine, such generative AI systems are more complex to maintain and will cost more resources. The big players like Google and Microsoft are already using this approach because more and more users are starting to use generative AI as the main door to searching online. It's an interesting trend. I assume that somewhere in the future, many models will have a very short update frequency and they will be connected to online tools. It is part of the market competition to deliver better models. It means that the challenge of knowledge cutoff will be slowly mitigated by the industry. Let's move to the next challenge. See you next. 32.
S05L04 It is not Deterministic: As part of the introduction section about generative AI, we talked about the concept of making the answers of a generative AI model a bit more creative, less deterministic, trying to simulate human thinking. It's like adding some spice to the answer. It means that the same prompt can produce different responses depending on the model's internal state and the level of randomness configured in that specific model. We may assume that it is a limitation, but it's not. This less deterministic output is achieved by design, not by mistake. Those systems use statistical models to calculate probabilities for a range of possible next words or phrases. As an example, if the input prompt is: what are the best ten ice cream flavors? The model will try to predict it by calculating the probability of popular flavors, like vanilla 95%, chocolate 80%, chocolate chip 62%, cookies and cream 57%, and so on. Those probability numbers are not real; I created them to explain the concept. As vanilla is, for example, at 95%, there is a high chance that the model will always select it as the first flavor. Then it will select the next flavor based on probability, in our case chocolate, which has 80%. If we run that same input prompt several times, we may get a slightly different list of flavors because the model may choose, for example, strawberry over salted caramel, even if the probability is a bit lower. It adds some level of randomness to simulate less deterministic answers. In some cases, this less deterministic behavior creates new challenges. Think about a generative AI system that is supposed to answer questions about legal and tax issues. Those are very sensitive topics, and users will expect to see highly professional and consistent answers to their questions. If one day a user gets a specific tax recommendation, and the next day he or she gets a different recommendation for the same question, they may not trust the system anymore. On the other hand, if I'm using the AI system to generate and brainstorm ideas for a marketing campaign, it may be useful to get different perspectives on the same topic. What is the solution for that challenge? Well, there is a way to control and influence the creativity and randomness of some models. It is called the temperature hyperparameter, measured as a simple float number between 0 and 1. When integrating with a generative AI system using APIs, the API request can include that specific parameter as a number. If the number is low, like 0.2, it is a low temperature, meaning the model will generate more deterministic and focused responses. Outputs are more likely to be predictable and consistent. If the number is high, like 0.8, it is a high temperature. The model generates more diverse and creative responses. The output is less predictable. Adjusting this parameter, the temperature, allows us to influence the level of randomness and creativity in the model's responses, helping to tailor the output to better fit specific use cases. 33. S05L05 Structured Data: Hi and welcome. Our next interesting challenge is related to structured data. Structured data is one of the most popular methods to organize information. Think about a simple spreadsheet that aggregates product reviews on a website. There are multiple columns like the product name, category, date and time when the review was provided, maybe information about the person that provided that rating, like age, gender, location, and finally, the provided review score. It can be a number of 1-5 stars.
33. S05L05 Structured Data: Hi and welcome. Our next interesting challenge is related to structured data. Structured data is one of the most popular methods to organize information. Think about a simple spreadsheet that aggregates product reviews on a website. There are multiple columns, like the product name, category, date and time when the review was provided, maybe information about the person that provided the rating, like age, gender, and location, and finally the provided review score, which can be a number of 1-5 stars. If I want to feed all that information into a generative AI system and ask it to predict the review score of a specific product based on a person's details, it will be a challenge. You may be surprised to hear that a typical generative AI model is not the best choice for handling structured data like tabular data in a spreadsheet. The main reason for that limitation is that generative AI models are primarily designed for handling unstructured text data, such as sentences and paragraphs. They are trained to capture patterns, context, and semantics in text. On the other hand, tabular data is structured in rows and columns, with a specific schema and relationships between columns and rows. This structure is fundamentally different from the linear, sequential nature of text data. Therefore, it is less intuitive for generative AI models that are mainly trained on text, which is unstructured. Another thing to take into account is the context window. Generative models have a limited context window, meaning they can only consider a certain amount of text as input. Feeding a large table with many rows and columns may exceed the context window, and in that case can lead to incomplete or inaccurate responses. The model is not able to digest a complete table. One approach to handle that limitation is to use a hybrid solution, meaning use a specialized tabular model as a pre-processing step that will convert the tabular data into a format that generative models can better understand. This is just one example. As a quick summary, we should be very mindful when trying to feed a generative AI system with tabular data. In many cases, it makes more sense to use different AI methods or solutions instead. Remember, generative AI is optimized to handle unstructured data.
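One common flavor of that hybrid idea is to serialize each row into a short sentence before it reaches the model. Here is a minimal, self-contained sketch of the pattern; the column names and rows below are invented for illustration.

```python
# Sketch: turning structured rows into the unstructured text that
# language models handle well. Column names and data are invented.

reviews = [
    {"product": "Wireless Mouse", "category": "Electronics",
     "age": 34, "location": "Berlin", "score": 4},
    {"product": "Desk Lamp", "category": "Home",
     "age": 27, "location": "Madrid", "score": 5},
]

def row_to_text(row: dict) -> str:
    return (f"A {row['age']}-year-old customer from {row['location']} "
            f"rated the product '{row['product']}' ({row['category']}) "
            f"{row['score']} out of 5 stars.")

# The serialized lines can then be placed inside a prompt, keeping an
# eye on the context window if the table has many rows.
prompt_context = "\n".join(row_to_text(r) for r in reviews)
print(prompt_context)
```

For anything beyond small tables, classic machine learning or dedicated tabular models are usually the better fit, as noted above.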
34. S05L06 Hallucinations: Our next topic may sound a little bit strange. Generative AI models may generate information that is incorrect or misleading. This strange limitation is called hallucination. The model is making things up, using fabricated facts. The problem is that it is making things up in a very confident, organized, and convincing way, like a great politician doing a cross-country campaign. This can mislead users into thinking that this is a true baseline. I experienced that issue several times while using generative AI for different use cases. The question is why such strange behavior is happening. Well, it may happen for many reasons: insufficient training data, if the model hasn't been exposed to a wide enough variety of data; data quality, as the data may include incorrect or misleading information; lack of up-to-date data, which is the knowledge cutoff we discussed earlier, as the model's knowledge is based on training data up to a certain point in time and it cannot access real-time information or recent developments, leading to outdated or incorrect responses; and more. This strange behavior can impact the reputation of a generative AI system, and it will be harder to trust the output of such models and leverage them safely in a production environment. It's a significant challenge. The good news is that the big players are improving their foundation models to minimize and mitigate such issues, making those models more reliable and safer to use in production environments. Still, they are not perfect, and users need to be aware of such limitations. Not every piece of information we get from a generative AI system should be considered a true baseline. 35. S05L07 Lack of Common Sense: Another interesting dimension to consider regarding generative AI is to understand that those solutions are very sophisticated pattern recognition systems. That's it. In many cases, they will lack the common sense that is expected from the average human being. Humans use common sense that is based on personal experience and knowledge. AI models generate responses based on statistical patterns rather than intuitive understanding. I think that's an important marking point between humans and machines. If you ask a person to help you break open a car, he or she will try to understand the overall context and decide if it makes sense to help you do it. Maybe you saw a child that was forgotten in that car. It is a complex scenario, and a human being can analyze that situation from many directions. On the other hand, AI models process language based on statistical associations. They generate text by predicting what comes next in a sequence, which is not optimized for a deeper understanding of real-life scenarios while considering many things in parallel. Training generative AI systems to consider such complex scenarios is a huge challenge. Adding common sense to machines is hard. Companies are making progress by putting in all kinds of safety protocols and logic, but it's still a major challenge. Maybe in the future your AI-based washing machine will have common sense; until then, try to separate the colors of clothes by yourself. 36. S05L08 Bias and Fairness: Let me ask you something. Do you think that everything that is written on Wikipedia is true and accurate? Well, that's an interesting question. Wikipedia is widely used as a valuable resource. I'm using it, and many people are landing on Wikipedia pages while searching for terms in search engines. It's a very popular organic result. Still, we need to approach the content of any website with a critical mindset. Wikipedia is the aggregate result of many people who created and adjusted the content. Wikipedia allows updating content by a wide range of contributors. Each contributor has a unique perspective that may be biased in a specific direction. A group of people may have an agenda about something that they would like to promote, even if it's not fully aligned with the real situation. There is also a risk of vandalism or intentional misinformation. That's just Wikipedia. What about the rest of the Internet, with billions of articles and websites that reflect the many biases that exist in society? There are many people and many different opinions on many subjects. Now, when we train a generative AI system with such content, the model knowledge will include a variety of biases on different topics, which may cause the model to generate outputs that are unfair, unethical, or even misleading. That's a huge challenge for companies that develop generative AI models, because they need to minimize it as much as possible. Secondly, it's a huge challenge for companies using those models. It can expose them to compliance challenges, as they are obligated by law to fulfill certain legal and ethical standards. Again, the good news is that large language model providers are investing huge resources to make sure these models become safer and less biased. They are doing that by using diverse and more representative data for training. In some cases, they employ tools to identify biases within the training data as well as in the output of the models, using all kinds of algorithms that can reduce such issues.
They also monitor the model's performance on an ongoing basis to identify and address those issues. And of course, based on that input, they perform ongoing updates to improve the model and reduce those biases over time. It is a constant work in progress. 37. S05L09 Data Privacy, Security, and Misuse: Data privacy and security is another major concern to consider in the context of generative AI systems. Let's talk about a few examples. The first issue is related to data leakage. Think about the situation in which your company is using a variety of third-party generative AI solutions like ChatGPT, Google Gemini, or Microsoft Copilot, running in the cloud, so people working in that company can leverage those tools for their daily work. That's a typical use case, right? What if some users are using a third-party tool and providing as input a prompt with very sensitive information, like a list of sales deals and revenue, or a piece of highly sophisticated code that was developed by the R&D department? There is a huge potential for data leakage while users are trying to leverage those tools. To mitigate those security risks, companies should develop a holistic end-to-end strategy, like using safer data-handling practices, educating end users about the usage of those systems, and more. Another important dimension of generative AI is the growing risk that it will be used by bad actors. That's the reality, and it's going to become more challenging in the future. Those models can be used to create disinformation, generate deepfake identities, generate sophisticated cyber attacks, and more. Just think that it is becoming harder to assess the credibility of the images and videos we see. Everything can be generated by AI in a very convincing way. That's going to dramatically change how people rely on digital information. It's part of the game, and we should be aware of those evolving risks. 38. S05L10 Summary: Hi, and welcome back. I hope that I didn't scare you too much about the challenges and limitations of generative AI. That's part of the game when the market is starting to use a new technology; as more market experience is gained, those challenges will be better mitigated. I assume that five or ten years from now, some of them will disappear and maybe new ones will emerge. Let's quickly review them one by one. We started with prompt sensitivity, meaning the input has a direct impact on the output, which makes a lot of sense. We need to be mindful when crafting prompts. The next challenge is the knowledge cutoff. Most models are trained on data up to a specific date. Any event or data created after that date is not part of the model knowledge. We need to be aware of that date and the capabilities of the model to handle it. Some models are updated at repeated intervals, and some are closing the gap by using online tools. Then we talked about the less deterministic nature of generative AI models. It is embedded in their design to better simulate human creativity, which is useful for many use cases. On the other hand, some use cases will require more deterministic behavior. In that case, there is an option in some models to tune the level of randomness. It's called setting the temperature. I also mentioned that generative AI models are trained to handle unstructured data. Therefore, we need to be very mindful when trying to feed a generative AI solution with tabular data. It may cause strange, unpredictable behavior. Moving next, we talked about the challenge that generative AI models may generate information that is incorrect or fabricated.
It is called hallucination, when the model is making things up. We should always remember that not every piece of information we get from a generative AI system should be considered the true baseline. Another challenge is the lack of common sense. Common sense is a typical capability of human beings, and it's very hard to simulate it with machines. Machines can be manipulated by sophisticated prompts, and they have a big challenge understanding complex real-life situations. The next challenge is a big headache for many companies. I'm talking about the ethical and bias issues. When training a generative AI system with a variety of data sources, the model knowledge will include biases on different topics, which may cause the model to generate outputs that are unfair, unethical, or misleading. And the last one was about data privacy, security, and misuse. I mainly want to emphasize the risk of data leakage. When using third-party generative AI solutions, we need to be more mindful of the data we're using as prompts, reducing the risk of exposing sensitive information. If the model is controlled by your company, that's a different story. It's a case-by-case situation. That's a quick summary of this section. Moving next, we will dive into practical use cases of generative AI. See you next. 39. S06L01 Introduction: Hi and welcome back. We covered a substantial number of topics and terms about machine learning and generative AI. I'm excited to start this section, where we'll build on that knowledge to explore the practical applications of generative AI and the market use cases that are shaping many industries. We are going to see how it's possible to leverage those capabilities to boost efficiency, creativity, and innovation. To set the stage and manage expectations, this section is not about presenting a long list of AI tools and how to use them. The AI landscape is constantly evolving, with hundreds and thousands of tools available for different use cases and industries. Instead, we'll concentrate on the key use cases where this technology can be used, ensuring that the insights will be applicable across a wide range of AI tools. Eventually, you or your company will select the best AI tools to fulfill specific requirements. All right, let's start to explore the power of generative AI. See you in the next lecture. 40. S06L02 Text Image Video Audio Generation: Generative AI is optimized to digest text as input and analyze the structure, patterns, and context in natural language. It's a very powerful capability. That's the input. What about the output? Well, there are a couple of content types that can be generated by generative AI systems. The first and most obvious one is text. The model type will be called a text-to-text model. Text is a very broad format that can hold a variety of different structures. Such models can generate many things, like answers to questions, a list of ideas, emails, articles, stories, reports, scripts, programming code, and more. One of the most popular applications of generative AI is the option to synthesize and generate images based on text as input. Those are text-to-image models. Describe the required image using a prompt: which objects should be included, color patterns, texture, et cetera. Using this text prompt, the AI system will synthesize a new image. Such new capabilities can empower many people who lack the skills to create artistic content. It's part of simplifying access to creative tools for a larger audience.
For content creators like me, such text-to-image models enable the rapid generation of visual elements for many day-to-day use cases. For example, if I need a specific picture for a new blog that I'm planning to publish, I have another tool in my toolbox. I can use such generative AI tools and generate an image based on my specific requirements. I can articulate the required image in text related to my blog content, without having extensive design skills. What about creating a short video clip? That's another interesting direction for applying generative AI. Generating a video from text is much more complex than generating images. As a result, it is not as developed as image generation, but things are progressing very rapidly. We can use such tools to describe a more complex scenario, and they will generate a group of images arranged as a sequence to create movement, creating a video clip. Those are text-to-video models. Big players like Google, OpenAI, and many other companies are exploring this interesting space, and we can assume that production-ready solutions will be available soon. Maybe by the time you watch this training, it will be more mature. It's hard to estimate, but I think it's going to happen very soon. As an example, think about a short marketing video on some product that will cost a fraction of the cost and time compared to manually creating that clip. It may help many small businesses with less deep pockets than the big players to generate marketing content. And the last one is about generating audio, meaning using text-to-audio models. It involves creating audio content such as speech or sound effects from textual descriptions. We can divide them into several subdomains. We have text-to-speech, the technology to convert written text into spoken audio. This is one of the most developed applications in text-to-audio, with a focus on producing natural-sounding human speech. It is getting better. However, it is still a huge challenge to synthesize speech with specific characteristics such as different accents, emotions, and speaking styles. Some content creators on YouTube, for example, are using text-to-speech technologies in their content. I'm personally not so keen to watch such content. It feels unnatural and unreal, like someone is cheating while using this technology. I guess in a couple of years it may be more acceptable and get into the mainstream. Another popular example is the gaming industry, using text-to-speech to enhance player experience and streamline development, like real-time generation of voiceovers based on player interactions or game events. By the way, it's a big headache in the entertainment industry, where it is now possible to synthesize the voice of any popular actor. It is creating challenges related to intellectual property. Next is text-to-sound-effects, meaning generating sound effects based on descriptions. This is useful for applications like video games, movies, and virtual reality. The last one is text-to-music, creating music compositions based on textual descriptions. Such models can generate complete songs, melodies, and all kinds of things. All those options are rapidly evolving, expanding the possibilities for creating and interacting with audio content based on text input. As a quick summary, we talked about text to text, text to image, text to video, and text to audio. There are also sub-use cases, like text to animation, which is a subset of text to video, or text to code, which is related to text to text, and more.
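As one concrete illustration of the text-to-image direction, here is a minimal sketch using the OpenAI Python client; other providers offer similar endpoints. The model name and size are placeholders chosen for illustration and may differ or change over time.

```python
# Sketch: generating an image from a text prompt over an API.
# Uses the OpenAI Python client as one example; the model name and
# parameters are placeholders and vary across providers.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

result = client.images.generate(
    model="dall-e-3",  # placeholder model name
    prompt="A minimalist illustration of a lightbulb made of gears, "
           "suitable as a blog header image",
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # a temporary URL to the generated image
```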
Let's move on and talk about the two main methods to consume and work with generative AI systems. See you next. 41. S06L03 Web Based vs Application Based: Hi and welcome. In the previous lecture, we talked about the main types of content that can be generated using generative AI, like text, image, video, and audio. As a result, there is a growing amount of off-the-shelf generative AI solutions for different use cases. The question is how they can be used. There are two main options to consume generative AI solutions. The first one is called a web-based application. It means that a company encapsulates a generative AI solution in a simple web tool. A great example is a generative AI chatbot. A web-based chatbot is an automated software application designed to simulate human conversation and interact with users through a web interface. The most popular examples are ChatGPT, Google Gemini, Microsoft Copilot, and maybe new tools that will be available after the recording of this training. Those tools are simple to use. They can handle a wide variety of tasks and are therefore becoming very popular. This is the first option to use generative AI models. It's a perfect solution for consumers. The second option is called software-based applications. It is relevant for organizations that would like to improve their software applications by using generative AI. Many generative AI capabilities can be embedded as small modules in a larger software application. For example, a user just added a product review on the Amazon website. The Amazon website is a combination of many software modules that are connected to create an end-to-end shopping experience. In that context, a generative AI model can take the product review provided as text and classify it as a positive or negative review. That's the first step. Next, it can extract the key takeaway from the text to use it as feedback. It can decide how to route this review to the relevant department, like product, marketing, or sales. This generative AI module is embedded inside a larger software application, meaning the Amazon website. Now, how do software modules communicate with each other? Well, using APIs. API stands for application programming interface. It is a set of rules and protocols that allows different software applications to communicate with each other. A generative AI model will have one or more APIs that are used as the interface to exchange data. It will have an interface to get a prompt as input, and maybe a few additional parameters, and also an interface for getting the output. That's how developers will integrate such capabilities into their software applications. As a quick summary, I mentioned two main options to consume generative AI solutions. The first one is a simple web interface that can be used by anyone, and the second one is using generative AI as a module in a larger application, based on APIs. Let's start to review the key use cases of generative AI. See you in the next lecture.
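To ground the software-based option, here is a minimal sketch of how a review-handling module might call a model over an API. It uses the OpenAI Python client purely as one example of such an interface; other providers expose very similar request/response patterns, the model name is a placeholder, and error handling is omitted.

```python
# Sketch: a generative AI module embedded in a larger application.
# Uses the OpenAI Python client as one example; the model name is a
# placeholder and error handling is omitted for brevity.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

def classify_review(review_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=0,        # we want consistent labels, not creativity
        messages=[
            {"role": "system",
             "content": "Classify the product review as POSITIVE or NEGATIVE "
                        "and name the team that should handle it (product, "
                        "marketing, or sales). Reply as 'LABEL, team'."},
            {"role": "user", "content": review_text},
        ],
    )
    return response.choices[0].message.content

# classify_review("Arrived broken and support never answered.")
# might return something like "NEGATIVE, product"
```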
42. S06L04 Use Case Brainstorm Assistant: Hi, and welcome back. One of the most useful use cases that I experienced while using generative AI is the ability to have a personal brainstorming partner to generate ideas. At any given time, I can open one or more popular chatbots and ask questions that can help me brainstorm additional ideas, additional directions, additional perspectives to consider. I'm using that capability for a variety of brainstorming sessions. It's not perfect, but in many cases it is a great starting point to give me some direction. Just keep in mind that it's not a replacement for brainstorming with real humans. It is just another tool in your toolbox. As a simple example, let's assume that I'm planning to create a new blog on a website. As a first step, I will ask a generative AI system to generate 10-15 ideas for a blog title. I will create an input prompt that describes the main blog objectives and the preferred title type or structure that I would like to use. The first tool I would like to use is the famous ChatGPT, created by OpenAI. Let's insert this input as a prompt: generate 15 ideas for a blog title. The blog is about methods to optimize a knowledge base on websites. The title should be engaging and up to seven words. Here we go. After a few seconds, I'm getting the list. Now, even if none of the titles matches my mindset, I can still pick up all kinds of useful words from here, like supercharge, transform, et cetera. Another option, if the output does not match my expectations, is to fine-tune the requirements in multiple iterations. Like a ping-pong session, I can refine the output by providing a new prompt, like: narrow the list to the five best ideas that can be more useful for search engine optimization. And I'm getting a new result. Finally, I will take those ideas, select the best matching one or two, and in some cases adjust and improve one of them manually, adding my own personal touch. That's just a small example, but it shows how powerful this brainstorming assistant can be. Product managers can use it to brainstorm ideas for product features based on market trends. A marketing department can use it to brainstorm ideas for marketing campaigns, including slogans, taglines, and promotional strategies. There are so many use cases where it can be used. As a small tip, when writing a prompt, try to provide enough context, relevant background, or any specific requirements. Don't assume that the generative AI system has that information. If you don't provide it, the system will use generic assumptions, and then the output will be more generic. The second thing to remember is that it is an iterative process. You can keep refining and tuning your prompt until you get the required output. We can find endless opportunities for that use case. It is one of the low-hanging fruits of generative AI. Nothing special is needed to use it, and I encourage you to add it to your daily toolbox. Great. Let's move to the next use case. 43. S06L05 Use Case Summarization: The next useful use case of generative AI is the ability to summarize text. In that case, the LLM is used as a logical reasoning engine instead of creating content. We can provide a long text from some article, as an example, and ask it to summarize that text in a specific way, like defining the number of paragraphs or pages, bullet points, and more. And to be honest, it can do a pretty impressive job. It can understand context, identify key points in the text, and then produce a concise, structured summary that can nicely capture the essence of the original text. Is that a replacement for reading a complete article? No. We need to remember that it is not perfect. It can miss some complex nuance in the text, or maybe drop important parts, or maybe create too generic an output while omitting specific critical information that is essential for a complete understanding of the text. The quality and accuracy of the model output are based on the capabilities of the specific model being used. A more powerful model has better logical reasoning capabilities to handle text. As a small tip, I'm using more than one generative AI tool in parallel for some tasks, so I can compare between them and sometimes combine the outputs.
Different models will generate different summaries. One major limitation to consider is related to the number of tokens. A generative AI model will have a maximum token limit for the combined input and output. I cannot provide 100 pages of a complete book as input. If the input text is too long, the model will not process the entire text, leading to incomplete or inaccurate summaries. In that case, we can consider breaking the original text into smaller elements that fit within the token limits. If this is a book, we can consider using the chapters to break it into smaller chunks, and then summarize per chapter. Let's move to the next use case. See you next.
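Here is a minimal sketch of that chunking idea. It splits a long text into pieces that should fit within the limit and summarizes each piece separately. `ask_llm()` is again a hypothetical wrapper around whatever model API is used, and the four-characters-per-token ratio is only a rough rule of thumb.

```python
# Sketch: summarizing a text that exceeds the context window by
# breaking it into chunks. `ask_llm` is a hypothetical wrapper around
# a model API; ~4 characters per token is a rough heuristic.

MAX_INPUT_TOKENS = 3000
CHARS_PER_TOKEN = 4  # rough approximation; varies by model and language

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a real model API here")

def split_into_chunks(text: str) -> list[str]:
    limit = MAX_INPUT_TOKENS * CHARS_PER_TOKEN
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        if current and len(current) + len(paragraph) > limit:
            chunks.append(current)
            current = ""
        current += paragraph + "\n\n"
    if current:
        chunks.append(current)
    return chunks

def summarize_long_text(text: str) -> str:
    partials = [ask_llm("Summarize the following text in 3 bullet points:\n" + c)
                for c in split_into_chunks(text)]
    return ask_llm("Combine these partial summaries into one short summary:\n"
                   + "\n".join(partials))
```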
44. S06L06 Use Case – Text Enhancement: As you probably know, many great software tools can be used to correct grammar mistakes and enhance text structure. Those tools can be embedded as extensions in our operating system, web browser, or text editor. Those tools use generative AI as the engine to analyze the text and provide suggestions and corrections. The downside is that those tools cost money. We need to pay for monthly or maybe yearly subscriptions. If I'm creating content on a regular basis, that's probably a good investment. However, if I need it for a one-time project, or to use it with a lower frequency, it may be less attractive. A nice alternative is to use those popular chatbot generative AI tools. Many offer the option to use them for free with some limited capabilities. I hope it will not change in the future. We can basically copy-paste the text we want to enhance as an input prompt and explain what we need, like fixing grammar mistakes, making the content of the text more exciting, changing the flow of topics, and so on. Those are very powerful and versatile tools that we can control by providing the relevant prompt. I'm using this option for a variety of content that I'm creating, like writing content for lectures, blogs, articles, product descriptions, and more. My method is to use text enhancement at a later stage, when I have an almost final draft, but it's completely up to you. The reason is that I want to maximize my personal human touch, making sure that my mindset influences most of the content. Otherwise, I may get content that has a high portion created by AI. Secondly, remember the sentence: garbage in, garbage out. If you provide as input a very early draft of your text, then most probably the output will be too generic. All right, let's move to the next use case. 45. S06L07 Use Case Code Generation: The next interesting and popular use case for a generative AI system is code generation. I would like to emphasize that it's not just for professional developers; let me explain that concept. There is a growing number of high-tech jobs that require basic programming capabilities in different programming languages. It is becoming useful even if it's not the mainstream objective of a specific job. Some level of flexibility is required. The capability to generate code using generative AI is making programming a more accessible and more tangible option for people who are not professional developers. They don't have the time or resources to spend two years learning a specific programming language. Sometimes they just want to get the job done without drilling down to every line of code. As an example, maybe someone is a data analyst, and SQL will be useful for extracting data from different data sources. If you're not familiar with databases, SQL is the most popular language to store and extract structured data. If this person is not using SQL daily, it will be hard to maintain the knowledge and experience to quickly use it. It may take a substantial setup time to remember the syntax and find the right solution while considering several options. That's a sweet spot for a generative AI system. This data analyst can use it to provide the structure and syntax of the best matching SQL query for a specific job. Maybe the provided answer will not be perfect, but it's a great starting point to manually tune. It is a great framework to quickly come up with a couple of solutions. I would like to share my personal experience as an example. Don't tell anyone, let's keep it between us. I'm not a developer, but I have some background in computer science due to my experience in practical data science projects and all kinds of other side projects. Now, I don't know if you are aware of it, but WordPress is one of the biggest open-source frameworks for creating websites. Millions of websites are based on WordPress. As part of my entrepreneurial spirit, I decided to create a software program, a WordPress plugin. A WordPress plugin is a piece of software that extends the WordPress core capabilities. There are many plugins for different use cases. Now, it was a side project, and I decided to build it from scratch, building my knowledge step by step. The challenge was that I did not have enough knowledge or experience in web development languages. I guess it sounds a little bit crazy, and I totally agree. Still, I decided to go ahead with that project. As part of the preparation, I learned the basic syntax of a couple of languages, and then I divided the project into smaller, more manageable pieces. Each time, I handled a specific piece of the puzzle. As you may guess, I used generative AI to help me come up with ideas, required code syntax, examples of best practices, and more. It was very useful, and I can share with you that I'm not sure I could have handled that project effectively without leveraging generative AI. It helped me speed up the development stage. Now, is that a perfect tool for any job? No. It is just another tool in our toolbox. For example, during the development phase, a new version of WordPress was introduced with new capabilities. That's the basic nature of software. Those capabilities can impact the way you develop a plugin as part of the WordPress ecosystem. However, the generative AI solution that I used had a knowledge cutoff. It was trained up to a certain point in time, and it was missing the new, up-to-date capabilities. It took me some time to understand those limitations. My prompt about the question I had was as clear as the blue sky on a nice day, and still it was not working. I was not sure why I was not getting the required output and why the generative AI system was generating unrealistic solutions based on my prompt. It had a knowledge cutoff, and I could not wait a couple of months until the model was updated, or not. I closed that gap by using the good old Internet, meaning search engines, blogs, articles, specific forums, books, and more. Another thing to consider is the complexity of the code you would like to generate. A typical software application is based on multiple modules, multiple layers that interact with each other to create an end-to-end solution. It is a complex architecture with many moving parts.
If I try to explain to a generative AI model the requirements of a complex application, it will need hundreds and even thousands of lines as an input prompt to explain all the features and functionalities of that application. A typical generative AI tool will not be able to process such a level of complexity and generate an end-to-end software tool. That's not going to happen, at least in the near future. We need to break the required application into little pieces and then use generative AI to handle one piece at a time. When using this approach, we increase the probability of getting more focused and useful output from a generative AI tool. The last thing I would like to share is a pure golden tip. When handling more complex projects, sometimes it is much better to ask a real developer or a real expert, directly or indirectly, using a forum. Don't try to rely too much on generative AI for code generation. It can take you forward up to some point. One important thing I would like to mention is that the ability to generate code is becoming an integrated capability in software applications that are used for software development. They are called IDEs, integrated development environments. Those tools use generative AI and other machine learning capabilities to suggest code snippets and code completions, generate boilerplate code, or even write a complete function or module. Those AI-driven capabilities have a dramatic impact on the speed of developing and testing software applications. Great. That's about code generation. Let's move to the next use case. See you next.
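To illustrate the data-analyst scenario from this lecture, here is the kind of exchange involved. Both the prompt and the generated query below are invented examples, and the table and column names are assumptions; as the lecture says, the output is a starting point that should still be reviewed and tested.

```python
# Sketch: asking a generative AI tool for SQL syntax help.
# The table and column names below are invented for illustration.

prompt = (
    "Write a SQL query that returns the top 5 products by average "
    "review score in 2024. Table: reviews(product_name, review_score, "
    "created_at). Use standard SQL."
)

# An answer in the spirit of what such a tool might generate:
example_generated_sql = """
SELECT product_name, AVG(review_score) AS avg_score
FROM reviews
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
GROUP BY product_name
ORDER BY avg_score DESC
LIMIT 5;
"""
```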
46. S06L08 Use Case – Content as a Framework: Hi and welcome. We covered some key use cases of generative AI, like brainstorming ideas, creating a summary of a long text, and generating a piece of code. That's great. But generative AI can do a little more than that. We can use generative AI to write more comprehensive content, like a blog on some topic, a story, an article, a script, a support answer to a customer, a press release, a post, a product description, and more. Let me show you a quick example. I will ask Google Gemini to generate a blog about best practices to optimize performance for websites. The blog will be divided into six main topics, and each topic will be around half a page long. That's the input. Here's the generated blog. It's a very impressive output. Now, it is very tempting to copy-paste and post that synthesized blog. It looks professional and nicely structured, with a good selection of words. Is that my blog? No. Nothing in it is related to my writing style or thinking process. Can anyone else generate the same content using the tool? Yes, of course. It's not unique. It is generic content. I don't think it is professional or ethical to publish content that was purely generated by generative AI. Okay, listen, from my perspective, we may assume that search engines are not smart enough, or that the end user will not notice what the source of the content is. But eventually, they will figure out that this blog was completely generated by an AI solution. There is no soul in that content. It is missing the human touch. Therefore, I would like to emphasize an important point. In the context of generating more complex text like a blog or an article, I suggest using generative AI as a starting framework for a draft version. We should avoid using it as the final output. In our example, I will take that blog content as a draft, read every sentence, and create a new version with many adaptations: remove things that are less relevant and add things that are more important. I will also change the writing style to match my personal perspective, remove too many fancy words that I would probably never use, and more. Try to make the content your content. I think you got my point. Let's move to the next lecture. 47. S06L09 Use Case – Images on Demand: Hi and welcome back. Until this point, we talked about the ability to generate text for many useful tasks. However, generative AI is much more than that. We talked about the capabilities to generate other types of content, like images, video clips, and audio. It is quite an amazing direction, and as you can imagine, it's going to reshape the art and design industries. It is now possible to generate interesting visual and audio elements by providing text. It simplifies the usage of tools and speeds up getting visual or audio content. I'm heavily using images for different content that I'm creating. It can be inside a presentation, a course landing page, a published blog, a report about something, and more. Usually, I will go to some free or paid websites that provide high-quality images and start to search using keywords. Sometimes it's a quick process, and sometimes it's very long and slow. This may happen because it takes me time to find the specific image that can fully articulate the message I would like to deliver. You remember the sentence: a picture is worth a thousand words. That's another interesting sweet spot of generative AI. Instead of searching hundreds of images using keywords, let's describe what is needed, and the generative AI system will generate it. We can call it generating images on demand. Let's see a simple example. I will provide a prompt: generate an image of a racing car from the future, based on a color scale between blue and black. The car should take around 30% of the picture. Here we go. Let's tune it a little bit: reduce the car size by a few percent and change the color scale to between red and black. Amazing. I recently started to use image generation, and it's making my life easier. The process of creating images based on my specific requirements is just mind-blowing, and it has saved me a lot of time. As always, it is just another tool in my toolbox, and I'm still heavily using regular images. I still like to use realistic images. It is not a replacement. It is another option. 48. S06L10 Use Case – Boosting AI Based Apps: Hi and welcome. All the use cases we covered so far seem like practical options for individuals like you and me and millions of people around the globe. Each one of us can access the powerhouse of a generative AI engine using a simple web-based interface. Just type a prompt and get the required output. However, this is a small fraction of the possibilities of using generative AI. Let's zoom out and talk about the business world. There is a variety of business domains: finance, transportation, healthcare, technology, manufacturing, retail, energy, education, construction, telecommunications, entertainment, and more. Each business domain has a long list of processes and workflows being used to run the business functions, functions like marketing, sales, operations, finance, human resources, and more. For example, in the retail domain, let's take the process of selecting a product online and performing a purchase. It is based on a workflow, a chain of steps that are handled by a variety of software tools. As you can imagine, each business workflow, each process, and each step is a candidate for integrating generative AI modules as part of a larger application. Generative AI can be used to boost many AI-based applications.
Back to our example: when a customer selects a product, a generative AI module will try to recommend additional products and services for that specific customer based on the selection. It is embedded and integrated as part of that end-to-end workflow. That's just a small example in the retail domain. The amount of possible use cases for those integrated applications with generative AI is huge, and we will see a growing number of companies and businesses trying to implement generative AI in different places. That's going to be the main direction for generative AI. There is a huge potential for business innovation, and as we know, nobody would like to stay behind. The race is on. I assume that in the upcoming five to ten years, many companies will develop different strategies to integrate and leverage generative AI. 49. S06L11 Best Practices for Prompts: Hi and welcome. We covered many use cases of generative AI. All of them are based on using a text prompt as input. The prompt is the main entry point or interface for getting the required content from generative AI systems. As a reminder, the prompt can be used directly in web-based generative AI solutions like ChatGPT or Google Gemini, or as an API request when integrated as part of a larger application. It is still based on providing a prompt. If the prompt is so important and it has a direct impact on the quality of the output, we had better invest a little in designing and optimizing our prompts. We already mentioned that. It's called prompt engineering: engineering better prompts. In this lecture, let's review a couple of simple tips related to prompt engineering. Number one: be specific and clear. The first tip is to clearly define what you want the AI to generate. Try to be as specific and clear as possible and state your expectations. It is important to construct a prompt that describes the required task in detail while providing relevant background, specific assumptions, and requirements. Number two: use contextual information. The second tip is about how important it is to provide context that helps the model understand the scenario or background. We need to consider what would be sufficient background information to complete the task. If I'm asking a model to create a short story about a topic, then contextual information will be to define the main characters, explain the situation, and set the overall theme or mood of that story. Another example is specifying the target audience for the content you would like to generate. For example: write an article about data privacy for IT professionals, or explain the concept of saving money to a 10-year-old child. By providing the context, the generative AI can better tune the response. Moving to the next tip: define the scope and boundaries of a task. Let's assume I would like to get a list of questions for a quiz related to some text. I will define the scope of the task, like generating ten questions; for each question, generating four to five possible answers; and one question should be based on a yes or no answer. Those are examples of defining the scope. Another thing to consider: ask for multiple options. If you're exploring different possibilities, ask the AI to generate multiple versions or options. For example: provide three different slogans for a new eco-friendly food product. Specify whether you need a short, concise answer or a longer, more detailed response. For instance: write a two-paragraph summary, or give a one-sentence answer. Okay? Those are all kinds of examples. Our next tip is related to the prompt and response structure and format.
Try to organize your input prompt in a logical structure, like explaining something to a human being. Try to specify the desired format. What do you want to get: text, code, or an image? If you need the output in a particular format, like bullet points, make sure to include that in your prompt. It is also useful to provide examples to illustrate the desired output. This helps the model understand the format, tone, and level of detail you're expecting to get. In many cases, it is an even quicker way of providing extra context than writing a comprehensive explanation of the desired result. Just give it an example. Number five: avoid confidential information. I already mentioned this as part of the section about challenges, but it's a good time to emphasize it again. When using a third-party generative AI model, we should always be careful and mindful about what kind of data we provide as part of the prompt. Avoid providing any confidential information as part of the prompt. Number six: request simplification and clarification. If you need the generative AI system to explain a complex concept, ask it to simplify or clarify the information. For example: explain quantum computing in simple terms for a beginner. Number seven: ask the model to consider different viewpoints or alternatives to generate a more comprehensive and diverse response. For example: describe the benefits and drawbacks of working from home. Those are two different perspectives on the same topic. Number eight is highly important: break down complex tasks. Most generative AI models will not be able to digest, understand, and generate good responses for a very complex task. As you recall from when I talked about generating code for a complex project, it is more practical and useful to break a complex task down into smaller tasks. This approach allows us to better zoom in on a specific task, generate a good prompt to explain what is needed, and increase the quality of the output. Anytime you have a complex task to handle, try to break it down first and then utilize AI tools. The next one is the last one: iterate and refine. In a typical conversation between two people, where one person asks something with a group of sequential questions, each question is used to refine the original question so the other person can fully understand the requirements and overall context. The same concept should be used while using a generative AI solution, especially when you are using a solution like ChatGPT; we can ask something with a group of sequential iterations. Sometimes our first prompt will not be perfect for getting the required response. It is based on experience while working with those tools. In most cases, we will need a multi-step process, and in each step we'll try to guide the model to provide the required output. This is tip number nine. All right. Those are the key tips I wanted to share. Thanks for watching so far. Let's summarize this section.
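Pulling several of these tips together, here is a small sketch of a reusable prompt builder that bakes in context, a scoped task, a required format, and an example. The structure is just one reasonable way to organize a prompt, not an official recipe.

```python
# Sketch: a prompt builder applying several of the tips above:
# context, a clearly scoped task, a required format, and an example.

def build_prompt(context: str, task: str, output_format: str,
                 example: str | None = None) -> str:
    parts = [
        f"Context: {context}",
        f"Task: {task}",
        f"Format: {output_format}",
    ]
    if example:
        parts.append(f"Example of the expected output:\n{example}")
    return "\n\n".join(parts)

prompt = build_prompt(
    context="The blog targets IT professionals who are new to data privacy.",
    task="Suggest 5 engaging blog titles, each up to seven words.",
    output_format="A numbered list, one title per line.",
    example="1. Five Data Privacy Myths, Busted",
)
print(prompt)
```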
50. S06L12 Summary: Hi and welcome. Let's quickly summarize the key takeaways from this section. We started by reviewing the main types of content that generative AI models can generate: text, image, video, and audio. Each content type is a complete category that can be broken down into a variety of formats. We also talked about the two main options to consume and use generative AI models. The first option is web-based, like ChatGPT. That's the most popular option for consumers like you and me. The second option is application-based, meaning using generative AI models inside other applications. That's going to be a huge innovation path for businesses and organizations. Moving next, we reviewed the most typical use cases for leveraging generative AI in our daily work. Those are the low-hanging fruits of that technology that can be used by anyone. The first one is to use it as a personal brainstorming assistant, meaning generating ideas, thinking directions, new perspectives, et cetera. The next use case is the ability to summarize text. It can understand context, identify key points in a text, and then produce a nice summary. It's a great way to speed up the work of analyzing long articles. We can also use generative AI models to enhance an existing text, as opposed to generating new content. Just copy-paste the text we would like to enhance as an input prompt and explain what we need, like making the content of the text more exciting or changing the flow of topics. Okay? There are many ways we can ask the generative AI to enhance the text. The next one was about code generation, which is becoming a very popular use case by making programming a more accessible option for a wider audience. Sometimes we just want to get the job done without drilling down to every line of code, and we need some examples to quickly remember the syntax of a particular programming language. The next one is a heavier usage of generative AI: using it to generate full content about something. I decided to use the name content as a framework to emphasize that it's not supposed to be a replacement for real human creativity. It should be used as a draft, a framework that will be used to develop something unique, useful, and less generic. I also mentioned the interesting use case of generating images from a text description. I decided to call it generating images on demand. That's something that can be very useful for content creators. Instead of searching for a specific image based on keywords, we can describe our requirements and get a unique synthesized image. The last use case is probably the biggest one. I'm talking about boosting AI-based applications with generative AI. It means that the generative AI model is embedded in a larger enterprise-level application. The market potential to integrate generative AI in almost any business domain is huge, as there are millions of applications that can leverage generative AI. The last lecture was used to review some best practices for crafting more effective prompts when working with generative AI models. We should be specific and clear with the required task and provide any relevant background information. If applicable, try to scope the task and define the expected structure and format of the response. Avoid sharing any confidential information. Try to break down a complex task into several simple sub-tasks. Finally, iterate and refine our requirements with sequential prompts. That's all for this section. See you next. 51. S07L01 Let's Recap: Hi and welcome to our last section. Our training is almost at the final stage. Thanks for watching so far. I hope it was interesting, as well as useful for you. At this point, I would like to recap the key terms and topics we covered while trying to make it more of an end-to-end story, so it will be easier to remember. We started by defining AI, artificial intelligence, as the human desire to create a digital brain and mimic human intelligence, so machines can perform more and more complex tasks. As a result, AI is not limited to a specific group of domains. It is a general-purpose technology that can be used almost anywhere.
Over the evolution of different technologies and their impact on the AI landscape, machines have become increasingly sophisticated. However, something was missing. Machines were missing the basic capability to learn and improve, which is a foundational capability of human intelligence. That's where machine learning algorithms were able to boost AI into new frontiers. Instead of building a software program that is preprogrammed with fixed rules and knowledge, it is possible to create a system that can dynamically digest and learn patterns from data. Then, to be able to digest and handle more complex patterns, deep learning was introduced, using artificial neural networks that are inspired by the human brain, with layers of interconnected neurons. For a long period, those machine learning methods were focused on specific tasks like prediction, classification, and clustering of data, and were doing a pretty good job. Generative AI added the important capability to analyze text as a language and to generate creative content. That's a breakthrough in the AI industry, taking us a step further in trying to mimic human intelligence. After building the main pieces of the AI puzzle, we moved to the next section as a soft introduction to the key terms in machine learning. Machine learning is the foundation of all those amazing technologies. We talked about using the input and output box illustration to describe ML solutions. Those solutions can be divided into four categories: prediction, classification, clustering, and content generation. Any machine learning box must be trained to perform a specific job, and that's part of the training phase. During that phase, an algorithm will consume training data and use it to optimize the trained model parameters, to better map the input of the box to the output of the box. The data going inside an ML box can be divided into several main data types: structured, unstructured, and semi-structured data. When zooming in on the data input of an ML box, we will find features. Features are the small elements of the data input. The selection and transformation of features is a critical step to make sure the model is getting the right data. We also mentioned the main methods used for training a model: supervised learning, unsupervised learning, and reinforcement learning. All of them are useful for different use cases. As soon as we managed to build good knowledge of machine learning, we moved on to breaking down the generative AI concept into small pieces. We started with the main building block of any generative AI solution: the artificial neural network created using deep learning. It is the internal structure that holds the knowledge of the model. As part of the evolution of different deep learning architectures to build neural networks, the transformer architecture was introduced, adding the capability to process data in parallel instead of sequentially, speeding up the training process. The transformer architecture was a perfect solution to train complex models based on huge amounts of data. However, building complex models requires substantial computing resources with high price tags. It is an expensive, resource-intensive project. As a result, it is a playground for the big players, not small companies. Those players can leverage their resources to train large models and make them available to the public as different services. And that's the concept of foundation models. A foundation model is a generic model that can be adapted and tuned to a wide range of tasks.
One of the most popular types of foundation models is the LLM, the large language model. LLMs are the core capability of generative AI to handle text as input and output. The input of the LLM is called the prompt, and it is broken down into small elements called tokens. Tokens are numbers used to map the text into a numerical format. The number of used tokens is measured as a metric for service consumption. In addition, the context window is a limitation on the maximum number of tokens that can be handled as a group under the same context. Using this knowledge, we managed to reveal how those LLMs generate a complex text response. It's all about predicting the next token in a sequence, one by one, in a loop, to create complex patterns like sentences, paragraphs, and so on. How does the model predict the next token? Based on a statistical distribution of possible predictions, selecting one token. How does the model know how to calculate the statistical distribution of possible predictions? By consuming a massive amount of text data and using the self-supervised learning method. What are the options to influence the output of a foundation model like an LLM? We mentioned three options. One, contextual prompting, by providing more context as part of the input prompt. Two, retrieval-augmented generation, a more complex solution, where the solution leverages internal databases to enrich the input prompt. Three, fine-tuning, where we take a pretrained model and retrain it with new data to create a new fine-tuned model. The next phase in our learning journey was to review some of the key challenges and limitations of generative AI. We talked about prompt sensitivity: any noise we put inside will be amplified by the generative AI system, so we should be mindful of the quality of the input prompt. Knowledge cutoff is the last date that the model was trained on. Any event or data created after that date is not part of the model knowledge. That's something to consider while using a specific model for a specific use case. We should be aware that many models are less deterministic by design, to make them more creative, and in some cases it is possible to tune the level of randomness to make them more suitable for specific use cases. Generative AI models are trained on unstructured data, so they may have limitations handling structured data like tabular data. We also need to remember that generative AI systems lack the common sense expected from a human being, especially while handling complex situations. When training a generative AI system with a variety of data sources, the model knowledge will include biases on different topics, which may cause the model to generate outputs that are unfair, unethical, or misleading. Those models are getting better at handling those challenges, but we should always be careful and mindful when using content generated by a generative AI system. The last section was dedicated to reviewing the most typical use cases of generative AI: brainstorming ideas, summarizing and enhancing text, code generation, using content as a framework, images on demand, and the last one, boosting AI-based applications. And the last lecture was used to review some best practices for crafting more effective prompts when working with generative AI models. That's our end-to-end story. 52. S07L02 Thank You!: Wow. You reached our last lecture, and that's great. I want to thank you for watching the complete training. You're more than welcome to visit again and refresh your knowledge on specific topics, and check if I released some interesting updates.
I hope that you enjoyed the training and learned some interesting things along the way. My main objective is to trigger your curiosity about generative AI and hopefully help you keep learning and developing your skills. That's the future, and it is a great opportunity to break into new, evolving domains. My last request is to get your important feedback. It will be awesome and useful if you can spend two to three minutes to rate the course inside the platform and share your personal experience. Each review is important. Secondly, feel free to share your experience and achievements on social media like LinkedIn. Just tag my name, Idan Gabrieli. That's it. Thanks again for joining this training. I hope to see you again in other training courses that I have released or am going to release. Bye bye, and good luck.