Gen AI 0-100: From Basics to Google Cloud Tools

Reza Moradinezhad, AI Scientist



Lessons in This Class

  • 1. Intro to GenAI (2:50)
  • 2. L1V1 - Traditional AI (18:42)
  • 3. L1V2 - Machine Learning (19:10)
  • 4. L1V3 - Deep Learning (14:55)
  • 5. L1V4 - Discriminative vs Generative (11:54)
  • 6. L2V1 - Transformers (7:53)
  • 7. L2V2 - Gen AI (14:52)
  • 8. L2V3 - Gen AI Applications (12:47)
  • 9. L2V4 - Prompt Engineering (11:24)
  • 10. L3V1 - LLMs (6:41)
  • 11. L3V2 - LLM Benefits (10:23)
  • 12. L3V3 - Examples of LLMs (10:13)
  • 13. L3V4 - Foundation Models (12:51)
  • 14. L3V5 - LLM Development (5:23)
  • 15. L3V6 - Tuning LLMs (8:54)
  • 16. L4V1 - AppSheet (6:56)
  • 17. L4V2 - Gen App Builder (6:11)
  • 18. L4V3 - Maker Suite (10:16)
  • 19. L4V4 - Generative AI Studio (12:56)
  • 20. Project Demo - AppSheet No-code App Builder (12:06)


52 Students

About This Class

Welcome to the Comprehensive Generative AI Course – your ultimate guide to mastering the exciting realm of Generative Artificial Intelligence. This course equips you with the essential knowledge and skills to thrive in this rapidly expanding field, which is projected to reach a value of $100 billion in the coming years.

Featuring hours of in-depth content, world-class slides, and an array of valuable resources, this is the most detailed Generative AI course available, focusing on the fundamentals of this groundbreaking technology. Whether you have zero programming experience or are looking to deepen your understanding, this course will take you from beginner to expert. Here’s why:

  • The course is taught by a PhD in computer science, with numerous publications and years of experience teaching at universities around the world.
  • You'll be working with the groundbreaking tools and technologies used by major companies like OpenAI and Google.
  • No shortcuts are taken—expect engaging presentations, quizzes, hands-on projects, downloadable resources, articles, and more.

Developed through years of university-level teaching, the curriculum has been refined based on student feedback and real-world testing.

We've taught thousands of students how to code, many of whom have gone on to become professional developers or start their own tech ventures.

Through step-by-step video tutorials, you'll gain everything you need to succeed in the field of Generative AI.

The course includes multiple hours of HD video tutorials and reinforces your learning with practical assignments.

Key topics covered in this comprehensive course include:

  • The role of prompt design in generating specific outputs
  • Foundation Models and their influence on Generative AI
  • Building apps with AppSheet and integrating Generative AI
  • Different types of Generative AI models (text-to-text, text-to-image, etc.)
  • Code generation using tools like Bard (Gemini), ChatGPT (GPT-3.5), and GPT-4
  • Large Language Models (LLMs) and their advantages
  • Prompt Engineering techniques for LLMs
  • Building conversational AI engines
  • The importance of deep learning in AI
  • Task-specific models in the Model Garden
  • Model training and deployment with Gen AI Studio and Maker Suite
  • Creating custom Generative AI applications

By the end of the course, you'll be equipped to apply Generative AI techniques to bring your creative ideas to life and develop innovative solutions across various industries.

Join us today and look forward to:

  • Animated Video Lectures
  • ~4 Hours of Instruction from a University Professor
  • Hands-on Generative AI Assignments and Projects
  • Quizzes and Practice Opportunities
  • Tailored Generative AI Articles
  • Over $1000 worth of Generative AI course materials and curriculum

Who is this course for?

  • Anyone interested in understanding the rapidly evolving field of artificial intelligence with creative potential.
  • Those curious about exploring the fascinating world of Generative AI and its innovative applications.
  • Individuals looking to harness the power of Generative AI for business purposes.
  • Learners who want a single, comprehensive course that covers everything they need to know about Generative AI.

Meet Your Teacher

Reza Moradinezhad

AI Scientist


Hello, I'm Reza.

I am passionate about designing trustworthy and effective interaction techniques for Human-AI collaboration. I am an Assistant Teaching Professor at Drexel University College of Computing and Informatics (CCI), teaching both undergraduate and graduate level courses. I am also an AI Scientist at TulipAI, leading teams of young students, pushing the mission of empowering media creators through ethical and responsible use of Generative AI.

I received my PhD in Computer Science from Drexel CCI. My PhD dissertation focused on how humans build trust toward Embodied Virtual Agents (EVAs). I have collaborated with MIT Media Lab, CMU HCII, Harvard University, and UCSD, publishing and presenting in venues such as Springer Nature, ACM CHI, and ACM C&C.

Level: All Levels



Transcripts

1. Intro to GenAI: There are a lot of courses out there on generative AI. I spent a lot of time going over many of them because I wanted to make sure that in this course, I'm giving you all the fundamentals that you need to completely understand what generative AI is. And on top of that, I'm going to give you some practical examples, some hands on demos on different tools that use generative AI that can help you today. I'm Professor Reza, and I teach undergraduate and graduate students topics on computer science and artificial intelligence. I also have thousands of online students. I have done research on AI, and I collaborated with prestigious institutes like MIT Media Lab, Carnegie Mellon University, Harvard University, and University of California San Diego. And the results of those works have been published in venues like Springer Nature and ACL. I'm going to use all of that experience and everything else that I've learned in all of these times to let you know how you can understand the transformation from traditional AI to general AI. This course is divided into five different sections. In the first section, we will cover traditional artificial intelligence. We will give a definition of what artificial intelligence is. We also cover what machine learning is and discuss different types of machine learning, including unsupervised learning, supervised learning, and reinforcement learning, and we will also discuss deep learning and the difference between discriminative deep learning and generative deep learning. In the second section, we will discuss how to distinguish between generative AI and traditional machine learning. Then we will talk about generative AI and we will provide some examples of generative AI. We will discuss what transformers are and how they change the game in artificial intelligence. We will also cover topics such as prompt engineering and foundation models. Then we will discuss different types of generative AI and in this section with some examples of code generation using AI. In section three, we will discuss large language models. We will provide an introduction to them and also provide a comparison between LLM and generative AI, and we will also discuss the benefits of LLMs. In Section four, we will talk about different types of tools that Google Cloud provides us so we can use generative AI for our own projects. In the last section, I will provide a demo on how to build them up using generative AI without writing a single line of code. So if you're excited about learning what generative AI is and how you can use it in your daily life, let's dive in. 2. L1V1 - Traditional AI: In this video, we provide an introduction to traditional artificial intelligence. Artificial intelligence is a discipline like physics or chemistry. It is a branch of computer science that deals with the creation of intelligent agents, which are systems that can reason, learn, and act autonomously. In a more formal way, AI is the theory and development of computer systems able to perform tasks normally requiring human intelligence. One of the subfields of AI is machine learning. Machine learning is a program or system that trains the model from input data. That trained model can make useful predictions from new or never before seen data drawn from the same one used to train the model. Machine learning gives computers the ability to learn without explicit programming. Another subfield of AI is deep learning. Deep learning is a type of machine learning that uses artificial neural network. 
Artificial neural networks are inspired by the structure of the human brain, and they can be used to process complex patterns that traditional machine learning algorithms cannot. We will discuss machine learning and deep learning in more detail later on in this section. But before that, let's give an overview of AI. The rest of this video is structured as follows. First, we will provide a real life example of using artificial intelligence. Then we will provide a brief history of AI. Next, we'll try to understand what artificial intelligence is, and then we cover different types of AI, different applications of AI, and also discuss what the future of AI will look like. When we talk about AI making our lives easier, smart homes are a great place to start. Here's how it works. In a smart house, we've got home appliances and voice activated sensors. They're like your own personal assistant, tweaking the lights and air conditioning to match the weather outside. Then there's the security system. It's always on the lookout, detecting any unusual movement outside and alerting you immediately. Here's the really cool part. All these appliances talk to each other. They're connected and can even communicate with your car, for example, opening the garage door when you enter your driveway. And to top it off, you can manage all these appliances from your phone wherever you are. There's so much that AI can do for us. But before getting distracted by the applications, let's take a step back now and dive into how artificial intelligence came to be. Here's a timeline of artificial intelligence. In 1950, Alan Turing came up with the Turing test, a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. In 1956, John McCarthy coined the term artificial intelligence and organized the Dartmouth Summer Research Project on Artificial Intelligence, the first conference on AI. In 1969, Shakey the Robot was built, the first general purpose mobile robot. Although simple by today's standards, Shakey marked a milestone in AI development by demonstrating the ability to process data and perform tasks with a purpose. In 1997, Deep Blue defeated world chess champion Garry Kasparov, the first time a computer had beaten a human at a complex game. Deep Blue's victory was a major breakthrough for AI, demonstrating the ability of computers to learn and adapt. In 2002, the first commercially successful robotic vacuum cleaner was introduced. And over the decade 2005-2015, we saw the development of a number of new AI technologies, including speech recognition, robotic process automation or RPA, dancing robots, smart homes, and self driving cars. In 2016, AlphaGo, a computer program developed by Google DeepMind, defeated world Go champion Lee Sedol. AlphaGo's victory was a major milestone for AI, demonstrating the ability of computers to master complex strategic games. In 2017, transformer technology was introduced in a paper called Attention Is All You Need. Transformer technology is now widely used in natural language processing tasks such as machine translation and text summarization. In 2020, GPT-3, a large language model developed by OpenAI, was released. GPT-3 is capable of generating human quality text, translating languages, and writing different kinds of creative content. And finally, in 2023, Google Cloud Gen AI tools were released, providing a suite of tools for developers to build and deploy AI applications.
In the same year, Bard, a large language model developed by Google AI, was released. Bard is capable of answering your questions in an informative way, even if they are open ended, challenging, or strange. This is just a brief overview of the history of AI. AI is a rapidly developing field, and new advances are being made all the time. It will be interesting to see what the future of AI holds. It's been quite a journey with artificial intelligence. Starting from the 1950s with Turing's groundbreaking test, the term artificial intelligence was coined in 1956, and a new era began. Over the years, we witnessed milestones like the creation of our first general purpose mobile robot, Shakey, in 1969. By 1997, computers were defeating chess champions, and now here we are in 2023, witnessing highly sophisticated language models like GPT-3 and Bard. It's a bit like the computer revolution of the 80s, but this time, it's all about AI. Mastering these new and powerful tools is becoming more crucial as the pace of advancements in AI is accelerating. The potential is immense, much like it was for those early computer adopters in the 80s. So buckle up for this exciting journey ahead. It's not just about watching what the future of AI will bring, but being a part of shaping that future ourselves. Understanding artificial intelligence. AI is a branch of computer science that creates smart machines capable of human-like tasks like speech recognition, object identification, learning, planning, and problem solving. Remember the computer that outsmarted the chess champion or the one that controls your house lights? That's AI solving problems, just like we do. Our current understanding of AI is largely based on how it interacts with us and how it compares to human capabilities. Things like speech recognition and object detection are big in AI today. It's about the ability to absorb information, learn from it, and use it to plan and tackle future tasks. Very human-like activities. This, in a sense, is the magic of AI. To understand AI correctly, we need to understand three concepts: different types of AI, different applications of AI, and different possibilities for the future of AI. Now, let's start with different types of AI. There are many different types of AI, but they can be broadly classified into four different categories. The first one is reactive AI. Reactive AI systems can only respond to the current state of the world. They do not have any memory of past events, and they cannot plan for the future. An example of that is a chess playing robot that only follows a set of logical instructions and reacts properly based on the opponent's move. The second type of AI is limited memory AI. These systems can remember past events and use this information to make decisions. However, they cannot reason about the future or understand the intentions of other agents. An example of that could be a maps application that suggests places to eat based on your previous visits. The third type of artificial intelligence is theory of mind AI. These systems can understand the thoughts and intentions of other agents. This allows them to cooperate with other agents and to achieve goals that would be impossible for a single agent.
One way that theory of mind AI could revolutionize the way we interact with machines is by creating robots that are able to provide companionship and support to people who are lonely or isolated or through virtual assistance that are able to understand our needs and provide us with the information and assistance that we need. The fourth type of AI is self aware AI. Self aware AI is a hypothetical type of AI that would be conscious and have its own subjective experiences. Suppose we have a personal robot assistant named Eve. If Eve were a self aware AI, she wouldn't just follow preprogrammed instructions or react to our commands. Instead, she'd understand her own existence and have her own feelings and thoughts. For instance, if we ask Eve to fetch a book from the library, a regular AI would just calculate the shortest path and go get the book. However, a self aware AI like Eve might think about whether it's a nice day for a walk or ponder if she's fetched too many books lately and suggest an e book instead. It's important to note that this kind of self aware AI is purely hypothetical at this point. Some researchers believe it's a future possibility worth exploring. Now, let's dive to applications of AI. Artificial intelligence is not just about imitating human capabilities, but also about augmenting our abilities and enhancing efficiency in many different areas. From transportation to healthcare, financial services to customer support, and education to entertainment, AI's potential seems to be limitless. Here are some examples of how AI is being used today. Self driving cars. AI is revolutionizing the way we travel. It is powering self driving cars that can navigate roads and avoid obstacles autonomously, leading to safer roadways. Medical diagnosis. In healthcare, AI is making a strides in diagnosis disease, often outpacing human doctors in accuracy. By analyzing vast amounts of medical data, it helps identify patterns and trends supporting doctors in making more informed decisions. Banking and fraud detection. AI has matured significantly in the banking sector over the past half decade, from predicting potentially fraudulent transactions to determining loan eligibility based on various factors, AI plays a pivotal role in today's financial sector. Customer service and online support. Imagine a company like HP managing 70,000 plus help pages across 17 languages. AI steps in here, automating customer support and providing round the clock service, significantly reducing the cost and enhancing the efficiency. Education. In the realm of education, AI is enabling personalized learning experiences, adapting to each student's individual needs, and fostering a more effective learning process. Entertainment. Our virtual assistance like Siri, Cortana, Alexa and Google are all powered by AI. With their voice recognition capabilities, they're like having a personal secretary at your command and cybersecurity. In the digital domain, AI is our watchman. With its machine learning algorithms and vast data analysis, it detts anomalies and responds to threats, strengthening our cybersecurity measure. You can see, AI has inter twined itself into every facet of our lives, enhancing our capabilities and reshaping both our commercial and business landscapes. The possibilities are immense, and they continue to expand with each passing day. Now let's talk about the future of AI. When we think about the future of AI, it's really quite fascinating. 
We're on the verge of a time when self driving cars could be the norm. Imagine having robots at home helping us with tasks from making coffee to more complicated stuff. We're also looking at the rise of smart cities where AI runs everything from our phones to household appliances. Plus, robots are stepping up to do high risk jobs like bomb diffusel, for example. As AI continues to develop, it is likely to have an even greater impact on our world. So based on all of that, some possible applications of AI in the future may include automated transportation. Imagine a world where AI does all the driving for us. We're getting closer to a reality where self driving cars are a standard way to get around. This isn't just about cars, though. We're talking automated drones, delivering our packages, AI driven trains, ensuring precise, timely transport, and even autonomous boats and planes. All this aims to make our journey safer and more efficient by reducing human error. It's a massive shift that could redefine how we think about transportation. Personalized medicine, AI can be useful to analyze large amounts of medical data to identify patterns and trends that can help doctors diagnose and treat diseases more effectively. For example, AI can be used to develop personalized cancer treatment that are tailored to the specific genetic makeup of each patient. Virtual assistance. AI powered virtual assistance can help us with a variety of tasks such as scheduling appointments, making travel arrangements, and managing our finances. Virtual assistance can also provide us with information and entertainment, and they can even be used to control our smart home devices. And speaking of smart homes themselves, AI can be used to make our homes more comfortable, efficient, and secure. For example, AI can be used to control our thermostats, lights, and other appliances. And it can also be used to monitor our homes for security threats. And last but not least artificial general intelligence or AGI. AGI is a hypothetical type of AI that would be as intelligent as a human being. AGI could potentially solve some of the world's most pressing problems such as climate change and poverty. However, AGI also raises some ethical concerns, such as the potential for AI become self aware and to develop its own goals and desires. AI is a powerful technology with the potential to revolutionize many aspects of our lives. It is important to be aware of the potential benefits and risk of AI and to use it responsibly. I hope you enjoyed this brief explanation on AI. In the next video, we will have a more in depth look at machine learning. 3. L1V2 - Machine Learning: What is machine learning? In this video, we're going to cover the basics of machine learning. Specifically, we're going to start with a definition of what machine learning is. Then we provide a comparison between artificial intelligence, machine learning, and deep learning. Then we discuss how machine learning works. We talk about different types of machine learning, supervised learning, unsupervised learning, and reinforcement learning. We talk about the prerequisites for machine learning, and at the end, we provide some examples of applications of machine learning. So what is machine learning? Machine learning works on the development of computer programs that can access data and use it to automatically learn and improve from experience. This enables machine learning to help us do complex tasks, such as three D printing of entire houses. 
By using algorithms and large datasets, machine learning can automate design and planning, helping to address construction challenges such as structural integrity and material efficiency. Can also customize designs based on environmental conditions. It can help reduce cost and time while increasing precision and has the potential to transform the construction industry. As another example, consider our personal assistance like Siri or Google Assistant or Amazon Echo. They all use the power of machine learning to help us with our everyday tasks like playing our favorite music or ordering food or voice controlling our home appliances or requesting rights from Uber and much more. As we said before, artificial intelligence is a technique which enables machines to mimic human behavior. This is key because it's how we figure out if our calculations and work are on the right track by seeing if they can imitate human behavior. We're using this approach to take over some of the work humans do with the aim of making things more efficient, streamlined, and accurate. AI is a broad field that covers many different technologies. Some examples of artificial intelligence include IBM, Deep Blue chess, electronic game characters, and self driving cars. These are just a few examples of many ways that artificial intelligence is being used today. Machine learning is a technique that uses statistical methods to allow machines to learn from their past data. This means that machines can use past inputs and answers to help them make better guesses in future attempts. Google search algorithm and email spam filters are examples of applications of machine learning. And then we have deep learning, which is a subset of machine learning. Uses algorithms to allow models to train themselves and perform tasks. AlphaGo and natural speech recognition are two examples of deep learning. Deep learning is often associated with neural networks which are a type of black box model. As a black box model, it's difficult for humans to track how deep learning models make their predictions. However, deep learning models can still be very effective at performing tasks. We will have a deeper dive into the world of deep learning later on. Now let's see how machine learning works. To understand how machine learning works, let's have a look at the following diagram. In the first step, we start by training our data. Then we fed the trained data into a machine learning algorithm for processing. The process data goes through another machine learning algorithm. And now it's time to test our work. We bring in some new data and run it through the same algorithm. In the next step, we check the predictions and the results. If we have any reserve training data, now is the time to use them. In the next step, if the prediction doesn't look right, let's say it gets the thumbs down, it's time to circle back and retrain the algorithm. Remember, it's not always about getting the right answer right away. The goal is to keep trying for a better answer. You might find the initial result isn't what you wanted. That's okay. It's part of the process. And it can depend on the field you're working in, whether it's healthcare, economics, business, stock market, or something else. The results can be very different. So we need to try out the model, and if it's not giving us the result we need or if we think we can achieve better outcomes, we retrain our model. And in the final step, we keep refining and retraining until we get the best possible answer. 
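
To make that loop concrete, here is a minimal sketch, assuming scikit-learn is available; the synthetic dataset, the logistic regression model, and the 0.90 accuracy threshold are arbitrary stand-ins for whatever data, model, and target quality you are actually working with.

```python
# Minimal sketch of the train -> predict -> evaluate -> retrain loop described above.
# Assumes scikit-learn is installed; a synthetic dataset stands in for real data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Hold some data back so the model is tested on examples it has never seen.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)                                # training step
accuracy = accuracy_score(y_test, model.predict(X_test))   # check predictions on new data

# If the result is not good enough, adjust the settings and retrain.
if accuracy < 0.90:
    model = LogisticRegression(max_iter=1000, C=0.5)       # tweaked settings (illustrative)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

print(f"Test accuracy: {accuracy:.2f}")
```
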
That's how machine learning works. Now let's look at different types of machine learning. We can see that we have supervised, unsupervised, and reinforcement learning. We'll go into each one, get a good idea of when and where to use them, and what they're all about. In machine learning, we use lots of different algorithms to deal with tough problems. Each one fits into a certain type. So we've got three main types of machine learning algorithms: supervised learning, unsupervised learning, and reinforcement learning. Now let's get into what each of these learning methods really means. Supervised learning uses labeled data to train machine learning models. Labeled data means that the output is already known to us. The model just needs to map the inputs to the outputs. An example of supervised learning can be to train a machine that identifies the images of animals. Here, we can see a trained model that identifies the picture of a cat. Unsupervised learning uses unlabeled data to train machines. Unlabeled data means that there is no fixed output variable. The model learns from the data, discovers patterns and features in the data, and returns the output. In this example, our unsupervised model uses the images of vehicles to classify if it's a bus or a truck. So the model learns by identifying the parts of a vehicle, such as the length and width of the vehicle, the front and rear end covers, roof, hoods, the types of wheels used, and many other features. Based on these features, the model classifies if the vehicle is a bus or a truck. And we have reinforcement learning. Reinforcement learning trains the machine to take suitable actions and maximize reward in a particular situation. It uses an agent and an environment to produce actions and rewards. The agent has a start and an end state, but there might be different paths for reaching the end state, like a maze. In this learning technique, there is no predefined target variable. An example of reinforcement learning is to train a machine that can identify the shape of an object, given a list of different objects such as square, triangle, rectangle, or a circle. In this example, the model tries to predict the shape of the object, which is a square. Now let's look at different machine learning algorithms that come under these learning techniques. Some of the commonly used supervised learning algorithms are polynomial regression, random forests, linear regression, logistic regression, k-nearest neighbors, naive Bayes, support vector machines, and decision trees. And these are just some examples of algorithms used for supervised learning. There are so many other algorithms that are used in machine learning. For unsupervised learning, some of the widely used algorithms are k-means clustering, singular value decomposition, fuzzy c-means, partial least squares, Apriori, hierarchical clustering, principal component analysis, and DBSCAN. Similarly, there are so many other algorithms that can be used for unsupervised learning. And some of the important reinforcement learning algorithms are Q-learning, SARSA, Monte Carlo, and deep Q-networks. So as we said, there are so many different algorithms available to us, and choosing the right algorithm depends on the type of problem we're trying to solve. Now let's look at the approach in which these machine learning techniques work. So supervised learning methods need external supervision to train machine learning models, which is where the name supervised comes from.
They need guidance and additional information to return the result. It takes labeled inputs and maps it to known outputs, which means you already know the target variable. Unsupervised learning techniques do not need any supervision to train any models. They learn on their own and predict the output. They find patterns and understand the trends in the data to discover the output. So the model tries to label the data based on the features of the input data. And similarly, reinforcement learning methods do not need any supervision to train machine learning models. Reinforcement learning follows trial and error method to get the desired solution. After accomplishing a task, the agent receives an award. An example could be to train a dog to catch the ball. If the dog learns to catch a ball, you'll give it a reward, like a treat. And with that, let's focus on applications and types of problems that can be solved using these three types of machine learning techniques. So supervised learning is generally used for classification and regression problems. For example, you can predict the weather for a particular day based on humidity, precipitation, wind speed, and pressure values. As in another example, you can use supervised learning algorithms to forecast sales for the next month or next quarter for different products. Similarly, you can use it for stock price analysis or identifying if a cancer cell is malignant or benign. Unsupervised learning is used for clustering and association problems. For example, it can do customer segmentation, which is segmenting and clustering similar customers into groups based on their behavior, likes, dislikes and interest. Another example for the applications of unsupervised learning is customer churn analysis, which is a process of evaluating and understanding why and when customers stop doing business with a company. With the aim of developing strategies to improve customer retention. And finally, we have reinforcement learning. Reinforcement learning is reward based. So for every task or every step completed correctly, there will be a reward received by the agent. And if the task is not achieved correctly, there will be some kind of penalty. Now let's look at some examples. Reinforcement learning algorithms are widely used in gaming industry to build games. It is also used to train robots to perform human tasks. Multipurpose AI chatbots like hat GPT or Google Bart use reinforcement learning to learn from the user input and adjust their output based on previous conversations. And with that, we have come to the end of this section on supervised versus unsupervised versus reinforcement learning. Now, let's see what are the prerequisites of machine learning. So the first one is computer science fundamentals and programming. Many machine learning applications today require a solid foundation in basic scripting or programming. It is not just about writing complex algorithms, but being able to understand and manipulate the underlying structures. Without a good grasp of these fundamental skills, it will be challenging to make the most of the machine learning tools available. So if you're seriously considering diving into machine learning, it's advisable to brush up your programming skills. Intermediate statistical knowledge. A fundamental understanding of probabilities is needed in the world of machine learning. You'll often find yourself asking questions like, if A is happening, what's the likelihood that B will occur? 
Or if there are clouds overhead, what are the chances it will rain? These type of questions, rooted in probability, are at the heart of many machine learning algorithms. It's all about predicting outcomes based on given conditions. So if you're keen to make significant strides in machine learning, it's definitely worth familiarizing yourself with the basics of a statistics and probability. Linear algebra and intermediate calculus. Linear algebra is key as it requires you to grasp the concept of plotting a line through your data points and understanding what it represents. This is the primary idea behind linear regression models where you draw a line through your data and use this line to compute new values. Regarding intermediate calculus, it involves having a basic understanding of differential equations. No need to be a master at it as the computer handles most of the heavy calculations, but it's beneficial to recognize the terminology when it pops up, especially if you're diving deeper into model programming. And data angling and cleaning. Perhaps one of the most significant aspects in this field is mastering the art of tidying up your data. It's often said if you input bad data, you'll get bad data out. But if you have good data in, it's more likely to have good data out. The quality of your data can greatly influence the outcome of your machine learning models. Therefore, understanding how to effectively clean and organize your data becomes a critical skill in ensuring the accuracy and reliability of your results. Now let's look at some examples of applications of machine learning. We have object detection and instance segmentation. Object detection and instance segmentation are two different but related tasks in machine learning. Object detection is about recognizing and finding items within a picture like telling different cats apart, for example. The other hand, instance segmentation is the next step that separates these identified objects from the rest of the image. These techniques are put to use in various ways, including identifying different elements in an image. Moreover, segmentation can further isolate or cut out specific components. One popular application of object detection and segmentation is the Google Pixel phones quick snapshot feature. This feature uses machine learning to identify objects in a user's current view and then overlay animated stickers or filters on top of those objects. This can be a fun and creative way to add a personal touch to photos. We also have license plate detection. This is a pretty cool use of machine learning. Imagine a car driving into view and the system is able to spot and identify the license plate on that car. This application of machine learning can be particularly useful in various situations like security checkpoints, parking lots, traffic control, or even for charging tolls without making the car stop in the middle of a highway. It showcases how machine learning can extract specific information from a larger context with accuracy. And we also have automatic translation. Automatic translation, powered by machine learning has been a game changer in breaking down language barriers. It is the driving force behind the instant translation that you see on foreign websites, making the content accessible in your preferred language with just a click. It is also the technology that enables tools like Google Lens to provide real time translations of signs when you point your camera at them. 
Whether it is browsing the Internet or navigating a foreign city, machine learning has revolutionized our ability to understand and interact with the world. Truly, automatic translation is an impressive testament to how machine learning can bring the world closer together. Thank you for watching this video. The world of machine learning applications is vast and ever evolving. It is one of the fastest growing sectors in technology, and the possibilities are endless. What I have shown you here are just a few of the highlights, a small sample of what is possible. But there is so much more to come. In the next video, we will shift years and explore deep learning, a soft field of machine learning that made many of the machine learning advances possible. See you in the next one. 4. L1V3 - Deep Learning: In this video, we'll talk about deep learning and neural networks. Have you ever wondered how Google can translate an entire webpage in no time from almost any language to another, or how Google Photos magically sorts your pictures based on the faces of people and pets it recognizes? Or how about when Google Lens fills you in on the details of a plant, object, or animal when you scan it with your phone? That's deep learning working its magic right there. In this video, let's try to answer the question of what is deep learning and how it makes all these incredible things possible. In this video, we'll be discussing the following topics. We'll start with an understanding of deep learning and then move on to artificial neural networks, which are a type of machine learning algorithm that are used in deep learning. We will then explore some of the practical uses of deep learning and introduce some of the most popular deep learning platforms. And finally, we'll discuss some of the limitations of deep learning and how quantum computers can tackle those limitations. It's going to be an exciting session on deep learning. So as we said earlier, deep learning is a subset of machine learning, and both are part of the bigger concept called artificial intelligence. Imagine artificial intelligence as the whole realm of making machines act like humans. Machine learning is a part of that realm, and it's all about giving machines the ability to learn and make decisions based on data, kind of like how we learn from experience. Now, deep learning is a more specific part of machine learning. It's like teaching a machine to think a bit like a human brain with a structure called an artificial neural network. Artificial neural networks or ANNs are a specific type of machine learning algorithm that try to loosely mimic the neural networks in the human brain. When we say deep learning, we're usually talking about using really big neural networks to train a model on loads of data. It's not different from machine learning, just a fancier term we use when things get pretty big scale. So what are artificial neural networks? Let's have a closer look at the construction of a neural network. Each layer consists of nodes or what we call neurons. The neurons in one layer connect with neurons in the next layer through channels. Each channel is assigned a weight, which plays a significant role in the network's learning. Every neuron has an associated bias and an activation function. The activation function is used to transform the weighted sum of the inputs and the bias into an output that is sent to the next layer. As we said before, ANNs sit at the core of deep learning. 
These algorithms are crafted in a way that mirrors the working of the human brain. They absorb data, learn to identify patterns in the data, and then make educated predictions for a new set of data. Let's explore the process by building a neural network capable of distinguishing between a cube and a pyramid. Consider an image of a cube as an example. This image is comprised of 28 by 28 pixels, resulting in a total of 784 pixels. Each pixel is then provided as input to individual neurons within the first layer. Neurons in one layer are connected to neurons in subsequent layers through channels. The inputs are multiplied by their corresponding weights, and then the bias will be added to it. This combined value then undergoes evaluation through a threshold function, known as the activation function. The result is transmitted as input to the neuron within the hidden layer. Then the output of the activation function determines whether a neuron becomes activated or not. Activated neurons transmit data to the neurons in the next layer via the channel. This iterative process known as forward propagation enables the data to propagate through the network. Within the output layer, the neuron with the highest value becomes activated and determines the final output. These values are essentially probabilities. In this particular scenario, the neuron associated with the pyramid has the highest probability, indicating that the neural network predicts the output as a pyramid. Well, obviously, our neural network has made an incorrect prediction. It's important to note that at this stage, our network has not undergone training yet. So let's have a look at the steps for training a neural network. During the training process, the network receives both the input and the expected output. By comparing the predicted output to the actual output, the network identifies the error in its prediction. The magnitude of the error indicates how wrong we are, and the sign suggests if our predicted values are higher or lower than expected. This information is then propagated backward through the network, a technique referred to as back propagation. Through back propagation, the network adjusts its internal parameters, such as the weights and biases to minimize the error and improve its future predictions. The iterative cycle of forward propagation and back propagation is repeated with multiple inputs during the training process. This cycle continues until the weights within the network are adjusted in a way that allows the network to accurately predict the shapes in the majority of cases. This marks the completion of our training process, where the network has learned to make correct predictions. Although training neural networks can be a time consuming process, sometimes taking hours or even months, the investment of time is justified given the immense possibilities they offer. The intricate nature of training involves fine tuning numerous parameters and optimizing the network's performance, requiring significant computational resources and patience. However, the benefits gained from a well trained neural network, such as improved accuracy, advanced pattern recognition, and sophisticated decision making outweighs the time spent in training. It's a reasonable trade off considering the remarkable potential and capabilities the neural networks bring to the table. Now let's have a look at some of the applications of deep learning. As we said earlier, it is the power of neural networks that makes deep learning possible. 
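
As a rough illustration of the forward propagation and back propagation steps just described, here is a minimal NumPy sketch of a tiny two-layer network taking a single training step; the input size, layer sizes, learning rate, and the "cube vs pyramid" target are arbitrary stand-ins, not the network from the lecture.

```python
# Minimal NumPy sketch of one forward-propagation pass and one back-propagation update
# for a tiny two-layer network (illustrative cube-vs-pyramid setup).
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1, 4))                    # one training example (stand-in for pixel values)
target = np.array([[0.0, 1.0]])           # expected output: the "pyramid" class

W1, b1 = rng.normal(size=(4, 8)), np.zeros((1, 8))   # weights and bias, hidden layer
W2, b2 = rng.normal(size=(8, 2)), np.zeros((1, 2))   # weights and bias, output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))       # activation function

# Forward propagation: weighted sum of inputs plus bias, passed through the activation.
hidden = sigmoid(x @ W1 + b1)
output = sigmoid(hidden @ W2 + b2)
print("error before update:", np.abs(output - target).sum())

# Back propagation: push the error backward and nudge weights and biases to reduce it.
delta_out = (output - target) * output * (1 - output)
delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)
lr = 0.5
W2 -= lr * hidden.T @ delta_out
b2 -= lr * delta_out
W1 -= lr * x.T @ delta_hidden
b1 -= lr * delta_hidden

# Forward propagation again with the updated parameters: the error should shrink.
output = sigmoid(sigmoid(x @ W1 + b1) @ W2 + b2)
print("error after update:", np.abs(output - target).sum())
```

A real training run repeats this cycle over many examples and many iterations, which is exactly the forward/backward loop the lecture describes.
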
Let's explore some of the key applications where neural networks shine. One notable example is facial recognition technology on smartphones, which utilizes neural networks to estimate a person's age based on their facial features. By distinguishing the face from the background and analyzing lines and spots, these networks correlate the visual cues to approximate the person's age. Neural networks also play a crucial role in forecasting, enabling accurate predictions in various domains such as weather forecasting or stock price analysis. These networks perform very well in recognizing patterns, making them able to identify signals that indicate the likelihood of rainfall or fluctuations in stock prices. Neural networks can even compose original music. They can learn intricate patterns in music and refine their understanding to compose original melodies, showcasing their creative potential. Another area where neural networks excel is customer support. Many individuals engage in conversations with customer support agents without even realizing they are actually interacting with a bot. These sophisticated networks simulate realistic dialogue and provide assistance, enhancing the customer service experience. Also in the field of medical care, neural networks have made significant strides. They have the ability to detect cancer cells and analyze MRI images, providing detailed and accurate results that aid in diagnosis and treatment decisions. And obviously, we also have self driving cars. Once only a possibility in science fiction, they have now become a tangible reality. These autonomous vehicles rely on neural networks to perceive and interpret the environment, enabling them to navigate roads, make decisions, and ensure passenger safety. And with that, let's have a look at some popular deep learning frameworks. Some of these frameworks are TensorFlow, PyTorch, Keras, Deeplearning4j, Caffe, and the Microsoft Cognitive Toolkit. These frameworks have gained widespread recognition and play a significant role in advancing the field of deep learning. And now let's discuss some of the limitations of deep learning. While deep learning holds tremendous promise, it is important to acknowledge its limitations as well. Firstly, although deep learning is highly effective in handling unstructured data, it needs a substantial amount of data for training purposes. The second problem is that even assuming we have access to the required data, processing it can be challenging due to the computational power required. Training neural networks demands the use of graphical processing units, or GPUs, which have thousands of cores compared to central processing units, or CPUs. And at the same time, GPUs are way more expensive than CPUs. And finally, training is time consuming. Deep neural networks can require hours or even months to train, with the duration increasing as the volume of data and the number of network layers increase. That said, it is worth mentioning that quantum computers developed by companies such as Google and IBM offer a potential solution to overcome these limitations. Quantum computers have the ability to perform complex computations at an exponentially faster rate than classical computers. With their unique architecture and quantum processing units, or QPUs, they have the potential to significantly accelerate the training process of neural networks. In addition, quantum computers can handle larger scale datasets more effectively.
Reducing the data requirements and mitigating the challenges associated with processing such vast amounts of information. While quantum computing is still in its earlier stages, ongoing research and development hold the promise of overcoming the limitations faced by traditional deep learning approaches. Thank you for exploring the world of deep learning with me. It is crucial to acknowledge that we are still in the earlier stages of exploring what deep learning and neural networks can do for us. However, big names like Google, IBM, and Nvidia have recognized this growth trajectory, investing in the development of libraries, predictive models, and powerful GPUs to support the implementation of neural networks. We are almost at the end of this section on traditional artificial intelligence. It's important to note that we have merely scratched the surface when it comes to the potential of deep learning and AI. Exciting possibilities lie ahead. As we push the boundaries of what is possible, the line between science fiction and reality becomes increasingly blurred. The future holds an overload of surprises, and deep learning is at the forefront of these groundbreaking advancements. In the next video, which is the last video of this section, we will learn the difference between discriminative and generative machine learning models, which will prepare us for the next section of this course on generative artificial intelligence. See you in the next one. 5. L1V4 - Discriminative vs Generative: In this video, we're going to talk about discriminative and generative algorithms. These are two important types of machine learning models. To make it easy for you to understand, we'll start with the story and then discuss how these two types of machine learning work in detail using popular algorithms as examples. So let's jump right in. All right. Let's dive into our story. Let's imagine we have two alien visitors who have never seen apples and bananas before. We want to observe how they learn to distinguish between these two fruits. The first alien decides to understand these fruits by drawing them. It carefully observes the shape, color, and texture of each fruit, and then recreates them on paper. This way, it creates a visual representation or a model of what each fruit looks like. Whenever it sees a new fruit, it refers to these drawings to identify that fruit. This is similar to what we call a generative algorithm in machine learning. The second alien, on the other hand, goes about it differently. Instead of drawing, it starts comparing the features of the fruits. It notices that apples are usually round and red while bananas are long and yellow. When it's given a new fruit, it doesn't look for a perfect match. Instead, it checks which fruits features are closer to the new fruit and guess that it is the same one. This approach is more like what we call a discriminative algorithm in machine learning. So that's the basic idea. These two different approaches will help us understand discriminative and generative algorithms. Moving on, let's formally define our two types of algorithms based on our aliens approach. The first aliens method is a prime example of what we call generative classification. This is where a model learns to generate a representation of each class. It's like learning how an apple or a banana looks and using that knowledge to identify the future instance. In contrast, the second aliens method represents discriminative classification. 
This model learns to distinguish between classes based on their features. Instead of learning what an apple or banana looks like, it learns the differences between them. It then uses these differences to decide what a new fruit might be. Each approach has its own strengths and weaknesses, and they're used in different scenarios. Now that we've introduced the concepts, let's explore them in more depth. To better grasp these concepts, we're going to discuss specific algorithms that employ these two types of classification. For discriminative classification, we're looking at logistic regression as an example. And for generative classification, our example will be the naive Bayes algorithm. So in the realm of discriminative classification, logistic regression creates a decision boundary based on the features of the input. For our fruit example, these features could be color, length, or weight. The algorithm learns patterns from these features and then uses them to classify new fruits. Conversely, the naive Bayes algorithm, which is a generative classification model, tries to understand the distribution of each class in the feature space. Rather than just identifying the differences between classes, it learns how each class is distributed in the data. Now, let's go deeper and understand how these algorithms use these strategies to classify new data. With the logistic regression model, we're dealing with features like color, length, and weight of the fruits. The model uses these features to learn patterns and make decisions. For example, it could learn that if a fruit has a yellow color and is longer than 5 inches, there's a high chance it's a banana. This method of creating a decision boundary based on the features of examples is the essence of discriminative learning. On the flip side, generative learning, used by the naive Bayes model, attempts to understand the distribution of each class in a multidimensional plane, like a three dimensional space for our three fruit features. The model tries to visualize where apples and bananas are likely to appear in this space based on their color, shape, and weight. Now, let's think about some important questions about generative and discriminative models. Questions like: which model needs more data for training? Which one gets affected by missing data? Which model gets impacted by outliers, which requires more math, and which one tends to overfit? It's important to think about these questions because they affect how you might choose to use these models. For example, a generative model doesn't need a lot of data because it's just trying to understand the basic characteristics of each class. However, a discriminative model needs more data because it's trying to learn the intricate differences between the classes. Thinking about these questions can help you understand these models more deeply and use them more effectively. But don't worry if you're not sure about the answers. We're going to discuss them in detail. So let's look at the first question: which model needs more data for training? Discriminative models, like logistic regression, generally need more data for training. They learn by identifying differences between classes, so they need a rich and diverse set of examples to do that effectively. The second question is: which one gets affected by missing data? The fact is that both types of models can be affected by missing data, but generative models might be more sensitive because they're trying to capture the overall distribution of the data, and any missing information could skew their understanding of that distribution.
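
Before working through the remaining questions, here is a minimal sketch that makes the two model families concrete, assuming scikit-learn is installed and using its bundled iris dataset as a stand-in for the fruit features from the story.

```python
# Minimal sketch contrasting a discriminative classifier (logistic regression)
# with a generative one (Gaussian naive Bayes), using scikit-learn's iris dataset
# as a stand-in for the apples-and-bananas example.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Discriminative: learns a decision boundary that separates the classes.
disc = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Generative: models how each class's features are distributed (a mean and variance
# per class), then classifies new examples by which distribution fits them best.
gen = GaussianNB().fit(X_train, y_train)

print("logistic regression accuracy:", disc.score(X_test, y_test))
print("naive Bayes accuracy:        ", gen.score(X_test, y_test))
print("per-class feature means learned by the generative model:")
print(gen.theta_)
```

Note how the generative model exposes what it learned about each class's feature distribution (the per-class means), which is exactly the kind of knowledge a discriminative model never builds.
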
The next question is: which model gets impacted by outliers? Again, outliers can affect both models, but discriminative models might be more susceptible. These models focus on boundaries between classes, and an outlier could significantly shift those boundaries. The next question is: which requires more math? In terms of mathematics, generative models like naive Bayes often require more calculations because they involve estimating the distribution of the data, which can be computationally intensive. And the last question is: which one tends to overfit? Overfitting can occur in both models, but discriminative models are generally more prone to it. This is because they can become too tuned to the training data, learning even its noise and errors. And now that we know the difference between discriminative and generative machine learning algorithms, let's break down some common examples of each type. Some of the discriminative algorithms are logistic regression, support vector machines, decision trees, random forests, and gradient boosting machines. Some of the generative algorithms are naive Bayes, Gaussian mixture models, hidden Markov models, latent Dirichlet allocation, and generative adversarial networks. So to wrap it up, in this video, we've unpacked the world of discriminative and generative models using a simple and engaging story. We've seen how logistic regression, a discriminative model, uses distinct features to create decision boundaries, while the generative model, naive Bayes, tries to understand the overall distribution of the data. Understanding the difference between discriminative and generative models gives us valuable insight into how generative AI systems operate. Generative models, like the ones used in generative AI, learn the underlying distribution of the training data. This knowledge is then used to generate new data that mirrors the training data. This is why generative AI is so powerful. It can generate new realistic outputs such as images, text, and even music because it understands the world of its training data. In contrast, discriminative models simply learn the boundaries between classes and are primarily used for classification tasks. They can't generate new data because they don't try to understand the underlying distribution of the data, only the differences between classes. By understanding these differences, you can better appreciate the capability and flexibility of generative AI systems. This comprehension could guide you when deciding which type of AI system would be best suited to your particular project or use case. And with that, we've reached the end of Section one of this course, traditional artificial intelligence. I will see you in Section two, where we discuss generative artificial intelligence. 6. L2V1 - Transformers: Now let's discuss transformers and their leading role in powering generative artificial intelligence. Transformers are a type of neural network that are able to learn long range dependencies in sequences. This makes them well suited for tasks such as text generation, where the model needs to understand the context of the previous words in order to generate the next word. Transformers produced a 2018 revolution in natural language processing. Now let's see how transformers work. Transformers are made up of two main parts, an encoder and a decoder. Encoders are responsible for taking an input sequence and converting it into a sequence of hidden states.
The encoder is made up of a stack of self-attention layers. Self-attention is a mechanism that allows the encoder to attend to different parts of the input sequence when generating the hidden states. This allows the encoder to learn long-range dependencies in the input sequence, which is essential for tasks such as text generation. Decoders are responsible for taking a sequence of hidden states and generating an output sequence. The decoder is also made up of a stack of self-attention layers. However, the decoder also has a special attention layer that allows it to attend to the input sequence when generating the output sequence. This allows the decoder to learn how to generate output that is consistent with the input sequence. So the encoder and decoder work together to generate an output sequence. The encoder first converts the input sequence into a sequence of hidden states. The decoder then takes these hidden states and generates an output sequence. The decoder's attention layer allows it to attend to the input sequence when generating the output, so it learns to produce output that stays consistent with the input. There are several benefits to using transformers for generative AI. First, transformers are able to learn long-range dependencies in sequences. This allows them to generate more realistic and coherent output. Second, transformers can be trained on very large datasets. This allows them to learn more complex patterns and relationships in the data. And third, transformer computations can be run in parallel. This allows them to be trained more quickly and efficiently. As a result of these benefits, transformers have become the state-of-the-art approach for a wide variety of generative AI tasks, such as text generation, image generation, and music generation. Something to be aware of when using transformers is that it is possible for them to produce hallucinations. In transformers, hallucinations are words or phrases generated by the model that are often nonsensical or grammatically incorrect. But why do hallucinations happen? Hallucinations can be caused by a number of factors, including the model not being trained on enough data, the model being trained on noisy or dirty data, the model not being given enough context, or the model not being given enough constraints. Hallucinations can be a problem for transformers because they can make the output text difficult to understand. They can also make the model more likely to generate incorrect or misleading information. So how can we mitigate hallucinations? There are a number of ways to mitigate hallucinations in transformers. One way is to train the model on more data. Another way is to use a technique called beam search, which allows the model to explore a wider range of possible outputs. And finally, it is important to give the model enough context and constraints so that it does not generate nonsensical or grammatically incorrect output. Here are some examples of hallucinations that have been generated by transformers. The cat sat on the mat and the dog ate the moon. The boy went to the store and bought a gallon of air. The woman drove to the bank and withdrew $1 million. As you can see, these examples are all nonsensical or grammatically incorrect. This is because the transformers generated these words or phrases without any context or constraints. It is important to note that hallucinations are not always a bad thing.
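Before we return to that point, here is a minimal sketch of the scaled dot-product self-attention operation that the encoder and decoder layers described above are stacked from. This is a simplified, single-head version using only NumPy with made-up dimensions; real transformer implementations add multiple heads, masking, and learned projection weights:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project inputs to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # how strongly each position attends to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                         # each output mixes information from the whole sequence

seq_len, d_model = 4, 8                        # e.g. 4 tokens, 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)     # (4, 8): one context-aware vector per token
```

Because every position can look at every other position in a single step, long-range dependencies do not have to be carried through a recurrent state, which is exactly the property highlighted above.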
And indeed, hallucinations are not always a problem: in some cases, they can be used to generate creative and interesting text. However, it is important to be aware of the potential for hallucinations when using transformers and to take steps to mitigate them. Transformers are being used to generate a wide variety of creative content, including text, images, music, and even video. Some of the most common applications of transformers in generative AI include text generation. Transformers can be used to generate text such as news articles, blog posts, and creative writing. For example, the transformer model GPT-3 has been used to generate realistic-looking fake news articles, and it can even write poetry and stories. Image generation. Transformers can be used to generate images such as paintings, photographs, and digital art. For example, the transformer model Imagen has been used to generate realistic-looking images of people, animals, and objects. Music generation. Transformers can be used to generate music, such as songs, melodies, and beats. For instance, the transformer model MuseNet has been able to generate original music that sounds like it was composed by a human musician. And we have video generation. Transformers can be used to generate video, such as movies, TV shows, and animated cartoons. For example, video generation models from DeepMind have been used to generate realistic-looking video that looks like it was filmed by a human camera operator. As the technology continues to develop, we can expect to see even more amazing applications of transformers in generative AI. Transformers have the potential to revolutionize the way we create and consume content, and they are already being used to create some truly incredible things. 7. L2V2 - Gen AI: Welcome to generative artificial intelligence. We start this video by explaining how to distinguish between generative AI and traditional machine learning. Then we provide a formal definition for generative artificial intelligence and end the video with some examples of generative AI. Here, we're showing two key approaches in artificial intelligence, traditional machine learning and generative artificial intelligence. The top image shows traditional machine learning. Here, the model learns from data with labels attached to it. What it does is it figures out the link between the features of the data and their corresponding labels. This understanding is then used to make educated guesses on new data it hasn't seen before. Now the bottom part of the image shows something a bit different: the generative AI model. Instead of just figuring out the relationship between inputs and outputs, it digs deeper. It focuses on the complex patterns in the content. This understanding of patterns is what gives it the power to create new and realistic content on its own. This could be anything, a poem, a news article, a picture, or even a music composition. So you see, generative AI brings a new creative angle to the huge world of AI. The nature of the output plays a crucial role in differentiating between generative AI and other models. Traditional models typically produce categorical or numerical outputs, such as whether an email is spam or not, or predicting sales figures. On the other hand, generative AI can produce outputs like written or spoken language, images, or even audio, reflecting its ability to generate content that mimics reality. We can imagine it like this mathematically. If this equation isn't something you've seen recently, here's a quick reminder.
The equation y = f(x) computes the outcome based on varying inputs. y symbolizes the result from the model. f stands for the function we use in the calculation. And what about x? That represents the input or inputs used in the equation. So in simple terms, the model's output is a function of all the inputs. The key here is understanding the nature of the output y as a function of the inputs x. Traditional models generally produce numerical outcomes, while generative AI models can map those numerical values to different forms of information, making them able to generate complex responses such as natural language sentences, images, and videos. To summarize at a high level, the traditional, classical supervised and unsupervised learning processes take training code and labeled data to build a model. Depending on the use case or problem, the model can give you a prediction; it can classify something or cluster something. The distinction lies in the application. Traditional models make predictions, classify, or cluster data, while generative AI models are more versatile, creating a wide range of content. The generative AI method can work with training code, labeled data, and unlabeled data of all kinds to construct what we call a foundation model. This foundation model can then produce new content, such as text, code, images, audio, video, and so on. Generative AI's power lies in its ability to ingest diverse data types, including unlabeled data, to build models that generate fresh content, which extends beyond traditional models' capabilities. We've come a long way, moving from traditional programming to neural networks and now to generative models. In the old days of traditional programming, we had to manually input the rules to identify a cat. We had to embed specific rules into the program. It was something like: if it's an animal with four legs, two ears, furry, and shows a liking for yarn and catnip, then it's probably a cat. And we had to write all of that in a programming language, not natural language. In the wave of neural networks, we could show the network images of cats and dogs, then ask, is this a cat? The network would likely respond with a prediction: it is a cat. So we can see that neural networks allow for more nuanced decision making by training on examples, which is an evolution from hard-coding rules. In the generative wave, we can produce our own content, such as text, images, audio, video, et cetera. Models like PaLM, or Pathways Language Model; LaMDA, Language Model for Dialogue Applications; and GPT, Generative Pre-trained Transformer, consume vast amounts of data from diverse sources, including the Internet, to construct foundation language models, which can be utilized simply by asking a question, whether typing it into a prompt or verbally talking into the prompt itself. So if we ask what's a cat, it can give us everything it has learned about the cat. Generative AI boosts user interaction, turning users from mere spectators into active creators. Models like PaLM, LaMDA, and GPT stand out. They're trained on large datasets and provide smart, context-aware answers. This focus on the user makes generative AI attractive for a range of different applications. Now let's provide our formal definition. What is generative AI? Generative AI is a type of artificial intelligence that creates new content based on what it has learned from existing content.
The process of learning from existing content is called training, and it results in the creation of a statistical model. When given a prompt, the AI uses the model to predict what an expected response might be, and this generates new content. The emphasis here is on the inherent ability of generative AI to learn and create. Unlike traditional models, which predict based on pre-established relationships, generative AI focuses on understanding the underlying structure of the input data. After training, the model can generate unique responses or content, which significantly broadens the applications and capabilities of AI systems. Essentially, it learns the underlying structure of the data and can then generate new samples that are similar to the data it was trained on. So let's see what the difference is between language models and image models. Generative language models learn about patterns in language through training data. Then, given some text, they predict what comes next. Generative image models produce new images using techniques like diffusion. Then, given a prompt or related imagery, they transform random noise into images or generate images from prompts. Let's dig a little deeper into each of them. As previously mentioned, generative language models focus on grasping the inherent structure of patterns within the data. They then leverage these learned patterns to generate novel responses or content, which often closely resemble the original data. These characteristics make large language models an exceptional example of generative AI's potential. A generative language model takes text as input and can output more text, images, audio, or decisions. For example, under the text output, question answering is generated, and under the image output, video is generated. So large language models are a type of generative AI because they generate novel combinations of text in the form of natural-sounding language. We also have generative image models, which take an image as input and can output text, another image, or video. For example, under the text output, you can get visual question answering, which is a task in computer vision that involves answering questions about an image, while under the image output, an image completion is generated, and under the video output, animation is generated. Like we mentioned before, generative language models learn about patterns and language structures through their training data, and then, when given some text, they try to predict what comes next. So in a sense, generative language models can be seen as pattern-matching systems, honing their ability to discern patterns from the data presented to them. Now that we've provided the formal definition of generative artificial intelligence, let's end this video with some examples of generative AI. Here is an example of Google Search's autocomplete feature. Based on things it learned from its training data, it offers predictions of how to complete this sentence: cats hate. Some of the suggestions are cats hate the smell of, cats hate water, cats hate cucumbers. Here is the same example using Bard, which is a language model that is trained on a massive amount of text data and is able to communicate and generate human-like text in response to a wide range of prompts and questions. So when I use the prompt, cats hate, it answers: Cats hate a lot of things, but some of the most common include, and then a list of things that it thinks cats would hate. And the same prompt, cats hate, using GPT-4 produces this response.
Cats can express dislike or discomfort in response to a variety of situations, objects, or behaviors. Below are some of the things that cats typically dislike, and it lists some of the things that it thinks cats would hate. Similar to Bard, GPT-4 is also a language model trained on a massive amount of text data and is able to communicate and generate human-like text in response to a wide range of prompts and questions. Now let's look at some examples of image generation. We use the same prompt on three different AI tools. The prompt is: a cat surrounded by things cats hate. If we try this prompt on DALL-E, which is an AI image generator built by OpenAI, the same company that built GPT, we get this result. We can also try it on Adobe Firefly's text-to-image generator, and these are some of the results that we get from Firefly. We can also try Canva's text-to-image application, and it provides some examples of what it thinks is appropriate in response to our prompt. Please keep in mind that here we used a very minimalistic prompt, just to show that even without providing much context, we can still produce results that are more or less relevant. We would get a much better result if our prompt included more detail and followed a solid structure. This points to the importance of prompt design and prompt engineering, which we cover later on in this section. In the next video, we will talk about transformers, a technology that made all of this possible. See you in the next one. 8. L2V3 - Gen AI Applications: Let's have a look at the types of tasks that different AI models can perform. These tasks can generally be classified based on the type of input data they accept and the type of output data they generate. Here are some examples. Text to text. This is typically used in machine translation, text summarization, and chatbots like Bard and ChatGPT. For example, if you ask GPT-4, what is a cat? It would tell you: a cat is a small carnivorous mammal that is often kept as a pet. The term generally refers to, and then it keeps generating more information related to cats. Another model is text to image. This is used to generate images from text descriptions. An example of this would be Canva's text-to-image application, which creates images from text inputs. In this example, we can use the prompt: a gray and white cat sitting on a window sill watching pigeons outside. And it will generate this image for us. Text to video. AI can also be used to generate videos from text descriptions, although this is a more complex and less explored task than text-to-image generation. For example, using tools like InVideo, we can create a video with just a text prompt. The AI model uses our prompt to write a script for the video and then picks images and video clips that are relevant to the content of the script. It can even apply filters and transitions to the video. Some of these tools can also pick music that is relevant to the content of the video. For example, the AI model can associate pets with playfulness and then pick playful music to add to the video. Text to 3D. These models generate three-dimensional objects that correspond to a user's text description. For example, if you ask Shap-E, a conditional generative model for 3D assets, to make an airplane that looks like a banana, it creates the 3D object you can see here. We also have text to code. These models are able to take a natural language description and then create code based on it.
This is useful for tasks like automated code generation, error detection, and code translation. For example, ChatGPT, Bard, and GitHub Copilot share the capability to generate code. Models such as ChatGPT, due to their training on a wide range of Internet text, including code, have the ability to generate code when provided with a suitable prompt. Bard, too, works along the same lines, with training that also covers programming-related text. We also have GitHub Copilot, which uses OpenAI's Codex model, which is trained on publicly available code, enabling it to suggest code completions and generate code from comments or function signatures. Text to task. Text-to-task models are trained to perform a defined task or action based on text input. This can be a wide range of actions, such as answering a question, performing a search, making a prediction, or taking some sort of action. For example, a text-to-task model could be trained to navigate a web UI or make changes to a doc through the GUI. We also have image to text. This is used in tasks like image captioning, where the AI describes an image in words. For example, BLIP, which is an AI model capable of image captioning and visual question answering, can take an image as input and provide a description of that image in text as output. We also have image to image. These models perform tasks like image translation, for example, converting day images to night, colorizing black-and-white images, or enhancing image resolution. For example, NightCafe is an AI image generator that can take an image as input to initialize the image creation process. It then produces a stylized image as output based on the user's prompt and other settings that can be adjusted. Depending on the chosen algorithm, artistic or coherent, the start image serves different purposes. For the artistic algorithm, the shapes and structures in the image are more significant, but in the coherent algorithm, more attention is paid to resolution, colors, and textures. As another example, let's look at Canva's Magic Edit feature. Using this feature, we can select a specific part of the image and replace it with a different image. For example, here, I select the basket and then tell the model that I want to replace it with a mountain range, hoping that the result will make it look like the cat is sitting on a rock. And we can see that the model creates some suggestions for me. If I'm not happy with the result, I can ask the model to regenerate the results until I find something I like. And here is the final result. I can even go further and select the ceiling in the background and ask the model to replace it with a sky with clouds. And this is what the result looks like. Video to text. This involves generating a text description or a transcription from a video. For example, Rask AI can create transcripts or captions from a video input, even translating them from one language to another. While what we see in this example is captioning only the speech element of the video, some of the more advanced captioning tools are able to caption non-speech elements as well. Non-speech elements can include sound effects, for example, a bee buzzing, keys jangling, or a doorbell ringing; music, either in the background or as part of a scene; audience reactions, for example, laughing, groaning, or booing; manner of speaking, for example, whispering, shouting, emphasizing a word, or talking with an accent; and speaker identification for an off-screen narrator or speaker, or for multiple speakers.
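To make the image-to-text task concrete, here is a minimal sketch using the Hugging Face transformers pipeline with a publicly available BLIP captioning checkpoint. The library, the checkpoint name, and the image path are assumptions for illustration, not part of the course materials:

```python
from transformers import pipeline

# Image-to-text: produce a short natural-language caption for an image.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

result = captioner("cat_on_windowsill.jpg")   # local file path or URL
print(result[0]["generated_text"])            # e.g. "a cat sitting on a window sill"
```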
Similarly, we have audio to text. This is typically used in speech recognition systems to transcribe spoken language into written text. Whisper by OpenAI is an example of an audio-to-text model. AI can also turn text into synthesized speech. Text-to-speech systems convert text into spoken language. Many AI tools, such as Play.ht, can take a text input and generate human-like speech with synthesized voices that can resemble different genders, ages, accents, or even different tones, such as happy, sad, angry, et cetera. Image to video. This task involves generating a sequence of images or a video from a single image or a set of images. For example, Cinematic Photos, a feature in Google Photos, utilizes machine learning to estimate an image's depth and construct a 3D representation of the scene, regardless of whether the original image contains depth information from the camera. Following this estimation, the system animates a virtual camera to create a smooth panning effect similar to a cinematic sequence. This intricate process uses artificial intelligence to transform a static image into a dynamic three-dimensional scene, giving it a video-like quality. What we discussed so far are just a few examples, and the list is continually growing as the field of AI progresses and as researchers invent new applications for these technologies. Furthermore, in many real-world applications, these tasks are combined. For example, an AI system may need to convert speech to text using a speech-to-text model, then process the text using a text-to-text model, and after that, generate an appropriate spoken language response, which uses a text-to-audio model. But wait, there's more. There are many other fields that Gen AI can have groundbreaking applications in. For example, just consider the world of music. Music-related tasks are an active area of research and development in AI. Here are some common tasks. We have text to music. These models can generate music based on text inputs. For example, they can create a melody or a composition described by a phrase or a piece of text. Music to text. On the other hand, AI can also convert music to text, such as creating sheet music for a song or generating descriptive or emotional text based on a piece of music. We have audio to audio, which can convert one type of sound or music into another, like changing the genre of a song, turning humming into a composed piece, or even removing vocals from tracks. There's also music recommendation. AI is heavily used in recommending music based on users' listening habits, preferences, and even mood. We also have music generation. MuseNet by OpenAI is one example of music generation. Models like OpenAI's MuseNet can generate four-minute musical compositions with ten different instruments and can combine styles from country to Mozart to the Beatles. There's also music enhancement. For example, Audio Studio can be used to enhance or alter existing music. It does this through things like upscaling audio quality, changing the tempo, or adding effects. And there is also music source separation. We can also use AI models to separate individual instruments, vocals, or other components from a mixed or mastered track. Meta's Demucs is an example of a music source separation tool. So in conclusion, the realm of generative AI is diverse, fascinating, and full of potential. The range of tasks it can perform, from text to text, text to image, audio to text, and even intricate tasks like music enhancement and generation, is truly remarkable.
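As a quick illustration of the audio-to-text task mentioned above, here is a minimal sketch using OpenAI's open-source whisper package. The package must be installed separately, and the audio file name is made up for illustration:

```python
import whisper

# Audio-to-text: transcribe spoken language from an audio file.
model = whisper.load_model("base")            # a small general-purpose checkpoint
result = model.transcribe("interview.mp3")    # can also translate with task="translate"
print(result["text"])
```

Small building blocks like this keep appearing across the generative AI landscape.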
It's a field that's continuously evolving, pushing the boundaries of what we thought was possible. As we continue to explore and innovate, the list of applications is only going to expand. Generative AI holds the key to many breakthroughs and advancements that could revolutionize numbers of sectors and the way we interact with technology. As we continue to explore this exciting era of AI, who knows what astonishing possibilities we might uncover. The key is to stay curious, keep exploring and start imagining a future in which we can interact with AI systems in a trustworthy, responsible and ethical manner. 9. L2V4 - Prompt Engineering: Let's talk about an intriguing topic in the realm of generative AI, prompt engineering. As the name suggests, generative AI is all about systems that generate output, whether it's text, images, or any other type of content. As you will see in the next section of this course, large language models or LLMs, which are the powerhouse of generative AI are designed to generate human like text based on input prompts. In addition to generating human like texts, LLMs also help translate our prompts into outputs of other type of content, such as images and videos. This means that the better our input prompts are, the likelier we are to achieve higher quality output from any generative AI tool. Today, we'll be exploring several key aspects related to these input prompts. We'll clarify what exactly a prompt is and its role in shaping the model's output. We'll distinguish between prompt design and prompt engineering and then move on to introduce various methods of prompt engineering. And lastly, we'll discuss the limitations of prompt engineering to give you a realistic understanding and expectation of this exciting process. So let's get started. So what is a prompt? A prompt is essentially a piece of text that is given to a generative AI model as input. But it's not just any takes. It serves a fundamental purpose. These prompts are your communication link to the model. They direct the AI model and steer its output generation. The model takes in your prompt, processes it, and delivers an output that aligns with the prompts instruction. In other words, these prompts are your tool for controlling the model's output. Think of a prompt as your guiding instruction to the generative AI model, like a director guiding an actor. The more precise and clear your direction, the better the performance you can expect from the actor. Similarly, well designed prompts enable the model to produce higher quality and more specific outputs. Remember, the key lies in the quality and design of your prompts, and this is where the concepts of prompt design and prompt engineering come into play, which we are going to discuss now. As we mentioned earlier, the quality of the prompt plays a crucial role in determining the quality of the output from a generative AI model. Here, two concepts come into the picture prompt design and prompt engineering. Prompt design refers to the crafting of prompts that are specific to the task that the model is asked to perform. For instance, if you want the model to translate a piece of text from English to French, the prompt would be written in English and specify that the desired output should be in French. In essence, it's all about creating prompts that will generate the desired output. On the other hand, we have prompt engineering. This process is more about enhancing the performance of the model. 
It involves strategies like leveraging domain-specific knowledge, providing examples of the desired output, or incorporating keywords known to be effective for a particular generative AI model. So you see, while both concepts revolve around crafting prompts, they serve different purposes. Prompt design is about tailoring prompts to tasks, while prompt engineering aims to boost performance. However, they aren't mutually exclusive. In practice, creating an effective prompt often involves both designing it for the task and engineering it for better performance. Now let's look at some of the techniques employed in prompt engineering to maximize the output quality of our generative AI models. One such method is using domain-specific knowledge. When you know the task area well, you can leverage that expertise to design prompts that guide the model more effectively. For instance, if you're working in medical AI, you might use medical terminology and structures in your prompts to increase accuracy. Another method is to use keywords known to be effective for a specific model. Just as in search engine optimization, where specific keywords help rank pages higher, certain keywords can direct the model more effectively. The choice of keywords would be based on the model's training data and its learned patterns. With models like Bard or ChatGPT, you can directly ask the model about these keywords and how to use them to optimize your prompt. We should also consider advanced strategies such as role prompting, shot prompting, and chain-of-thought prompting. Role prompting is a technique where we instruct the Gen AI model to take on a certain role or persona while generating its output. For instance, you could instruct the model to respond as if it's a historian explaining the causes of the First World War. The model then uses its training data to generate a response that aligns with this persona. Shot prompting, on the other hand, involves giving a shot of context before the actual instruction. You can provide examples of the desired output. This helps guide the model by providing it with a reference or blueprint of what's expected. For example, if you want a summary of a document, you might provide a few examples of summaries along with the original text. Or if you're looking for a review of a film, instead of simply saying, write a review of the film X, you could say: Imagine you have just finished watching the thrilling film X in a crowded cinema. Write a review of the film. This added context can guide the model to produce more emotionally charged and context-aware output. There are different types of shot prompting: zero-, one-, and few-shot prompting. Zero-shot prompting tasks the Gen AI model without prior examples. For example: translate this English sentence into French. The cat is sleeping. Here, we are providing a task without a specific example of how it should be done. One-shot prompting provides a single example for guidance. For example: continue the following story. Once upon a time in a land far away, there was a brave knight. And then we provide an example of a story continuation. The model then tries to continue the story following the style of the example we provided. And we have few-shot prompting, which is also known as multi-shot prompting. Here, we provide multiple examples to assist the model. An example would be asking the model to generate a product review, preceded by a series of example product reviews. The model will try to write a review similar to the examples we provided.
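To make shot prompting concrete before we get to the last technique, here is a minimal sketch of how a few-shot prompt might be assembled as plain text. The `complete` function is a hypothetical stand-in for whatever LLM API or tool you are using; everything else is just string formatting:

```python
EXAMPLES = [
    ("Battery life is amazing, but the screen scratches easily.", "Mixed"),
    ("Arrived broken and support never answered.", "Negative"),
    ("Exactly what I needed, works perfectly.", "Positive"),
]

def build_few_shot_prompt(new_review: str) -> str:
    """Assemble a few-shot prompt: instruction, labeled examples, then the new case."""
    lines = ["Classify the sentiment of each product review."]
    for review, label in EXAMPLES:
        lines.append(f"Review: {review}\nSentiment: {label}")
    lines.append(f"Review: {new_review}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt("The fabric feels cheap but delivery was fast.")
# response = complete(prompt)   # hypothetical call to your chosen LLM
print(prompt)
```

A zero-shot version would simply drop the EXAMPLES block, and a one-shot version would keep a single example.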
And last but not least, chain-of-thought prompting involves providing a line of reasoning or argument to the Gen AI model. Instead of a direct question or instruction, you give a series of thoughts that lead to the question. For example, instead of asking, what are the causes of global warming, you would prompt with: we've seen an increase in global temperatures over the last few decades. This change, often referred to as global warming, seems to be influenced by various factors. What are these causes? These strategies can enhance the richness and relevance of the model's output, further demonstrating the power of skillful prompt engineering. Remember, these are not standalone methods, but can often be combined to engineer a powerful prompt. Now that we have an understanding of these techniques, let's move on to the limitations of prompt engineering. While prompt engineering opens up exciting opportunities to fine-tune the output of a generative AI model, it's important to bear in mind that it's not a magic wand that can always guarantee perfect results. There are certain limitations and constraints we need to be aware of. First, generative AI models, although powerful, are not omnipotent. They're trained on a diverse range of data, but this doesn't mean they have the ability to accurately answer any question or fulfill any task you prompt. For example, the model doesn't have the capability to generate content outside its training cutoff date or accurately predict future events. Second, the accuracy and relevance of the model's output highly depends on the quality and clarity of your prompt. However, even a perfectly crafted prompt may not always produce the expected result due to the inherent unpredictability of AI models. Third, even with meticulous prompt engineering, models may sometimes generate outputs that are factually incorrect or nonsensical. This is because these models generate responses based on patterns they learned during training, and they don't understand the content in the human sense. And lastly, certain tasks might require a level of specialization or domain-specific knowledge that surpasses the model's training. A general-purpose generative AI model may not be able to accurately generate highly specialized content or respond to highly technical prompts in fields like law, advanced mathematics, or specific medical sub-disciplines. So while prompt engineering is a powerful tool, it is essential to be aware of these limitations to maintain realistic expectations and use generative AI models more effectively. Alright, let's wrap things up. Today, we've explored the world of prompts in generative AI models. We've learned about prompt design and engineering and discussed various methods like shot prompting, role prompting, and chain-of-thought prompting. Remember, prompt engineering is not a magic bullet. It's a tool, and like any tool, it has its limitations. So experiment with creating your own prompts and explore the possibilities. Thanks for watching, and I'll see you in the next one. 10. L3V1 - LLMs: Welcome to introduction to large language models. Large language models, or LLMs for short, are a subset of deep learning. They intersect with generative AI, which is also a part of deep learning. We already explained that generative AI is a type of artificial intelligence that can produce new content, including text, images, audio, and synthetic data. But what are large language models?
When we use the term large language models, we refer to large general purpose language models that we can pre train and then fine tune to meet our needs for specific purposes. But what do we mean by pre trained and fine tuned? Think about the process of training a dog. Typically, you instruct your dog on basic commands like sit, calm down, and stay. These commands are usually enough for day to day life, assisting your dog in becoming a well behaved dog in the neighborhood. However, when you require a dog to fulfill a special role, such as a security dog, a guide dog, or a police dog, additional specific training becomes necessary. The same principle applies to large language models, just like the specialized training prepares dogs for their unique roles, fine tuning a pre trained large language model enables it to perform specific tasks efficiently and accurately, whether it's sentiment analysis or machine translation. The model can be honed to Excel in the desired domain. These models undergo training with a broad focus, preparing them to address the standard language related tasks like text classification, a widely used natural language processing task which involves categorizing text into organized groups based on its content. Question answering, which is a significant task in natural language processing, where the model is trained to understand and respond to inquiries accurately, essentially simulating the human ability to comprehend and answer questions. Document summarization, where the model is tasked with producing a concise and fluid summary of a large text, maintaining the essence and primary ideas and text generation across multiple industries, creating human like text, which can be tailored to a specific industries, whether it's drafting emails in corporate communication, creating product description in ecommerce, or generating patient reports in healthcare. These models possess the capability to be fine tuned in order to solve unique challenges within various sectors, including retail, finance, and entertainment, utilizing comparatively smaller field specific datasets. For example, in retail, they can be used for personalized product recommendations based on text data. While in finance, they can aid in predicting market trends from financial reports. Or in the entertainment industry, they might assist in script generation or content recommendation, showcasing the flexibility and wide applicability of large language models. Let's further break down the concept into three major features of large language models. Large language models are large, general purpose, and pre trained and fine tuned. Let's discuss each of them separately. The term large refers to two things. Firstly, it points out to the massive size of the training dataset, sometimes reaching the scale of petabytes. Secondly, it points to the immense number of parameters involved. In the realm of machine learning, these parameters are often called hyperparameters. Essentially, these parameters act as the memory and knowledge that the machine gains during model training. They often outline the proficiency of a model in addressing a task such as predicting text. By adjusting these parameters, we can fine tune the model's performance for more precise prediction. General purpose means that the models are powerful enough to solve commonplace everyday problems. This concept is led by two reasons. Firstly, human language exhibits a universal nature, irrespective of the distinct task it's applied to. 
Secondly, we have to consider resource limitations. Only a limited number of organizations have the ability to train these massive language models, which require extensive datasets and a massive amount of parameters. So why not let these organizations construct foundational language models that others can use. This brings us to the final aspect of large language models, pre training and fine tuning. Essentially, this means that a large language model is first pre trained for broad use cases, using an extensive dataset, collecting a wide range of linguistic patterns and knowledge. Following this pre training stage, the model is then fine tuned to cater to particular goals using a relatively smaller, more specialized dataset. This two step process ensures the model maintains a broad base of understanding while also being able to deeply understand and generate predictions specific to a given field or task. And with that, we wrap up our introduction to large language models. In the next video, we will discuss some of the benefits of using LLMs. 11. L3V2 - LMM Benefits: In this video, we're going to explore the various benefits of using large language models or LLMs for short. We'll see how these impressive AI models can be used for a variety of tasks, how they function with minimal field training data, and how they continue to improve as more data and parameters are added. We'll also discuss how LLMs adapt to different learning scenarios, even with the presence of minimal prior data. So hopefully by discussing these benefits, we can see why LLMs are a major leap forward in the realm of artificial intelligence. There are many clear and impactful benefits in employing LLMs. They aren't restricted to a single task. One model by itself is a multitasking powerhouse fulfilling many different roles. These sophisticated LLMs, which are trained on an enormous volume of data and develop billions of parameters have the capability to handle a variety of tasks. For instance, they excel at answering questions. LLMs can sift through their extensive training data to find the most appropriate and accurate answers to a wide range of queries. They are capable of understanding context, ambiguity, and even nonces in language, making them highly effective at question answering tasks. The model generates a response matching the queries context, tone and complexity, providing precise and contextually appropriate answers. In terms of text generation, LLMs truly shine. They can create high quality text that is coherent, contextually appropriate, and remarkably human like. Whether it's generating a piece of news, writing a poem, or even coming up with an engaging story, LLMs are highly capable. They can also assist with tasks such as content creation, writing assistance, and even draft completion. By considering the given input and using their vast knowledge base, they can generate text that is not only grammatically correct, but also rich in content meeting the demands of various use cases. Large language models are also highly capable in language translation. Equipped with the knowledge of numerous languages from their extensive training data, they can accurately translate texts from one language to another, maintaining the semantic meaning and context of the original text. Not only do they work with commonly spoken languages, but LLMs can also handle less widely used ones, making them an invaluable tool for cross cultural communication. 
In addition, they can comprehend and adapt to different dialects, slang and informal language, ensuring the translations are accurate, readable, and natural sound. LLMs are also a powerful tool when it comes to brainstorming. They can generate ideas, suggest alternative perspectives, and contribute to creative problem solving. Whether you're looking for a catchy headline, a unique marketing strategy or a fresh plot for a novel, these models can generate numerous possibilities based on the context you provide them. By training on a vast array of data, they have learned to come up with diverse and innovative ideas which can spark further inspiration and help move your project forward. Not only that, but they can also offer critique and suggestions for improvement on existing ideas, acting as an artificial brainstorming partner available at any time. And there's much more. Beyond the task we've discussed so far, LLMs have an abundance of other capabilities. For instance, they can be used in sentiment analysis, determining whether a piece of text conveys a positive, negative or neutral sentiment. They can assist in summarizing long pieces of text. LLMs can also be used in tutoring systems, providing explanations for complex topics in a variety of subjects. The possibilities are endless and constantly expanding as these models continue to evolve and improve. Another major benefit of large language models is their ability to perform impressively with minimal training data tailored to a specific problem. They can deliver quality results even when provided with a small amount of domain specific data. This quality makes them highly adaptable to few shot or zero shot learning scenarios. Now, let's not get confused here. Let me explain what is the difference between shot learning and shot prompting. As we discussed in the prompt engineering video, shot prompting involves giving a shot of context before the actual instruction. This added context can guide the model to produce more emotionally charged and context aware output. We also said that there are different types of shot prompting, zero, one, and few shot prompting. Zero shot prompting tasks the GNAI model without prior examples. One shot prompting provides a single example, and few shot prompting provides multiple examples to assist the model. In the context of machine learning, few shot learning refers to scenarios where a model is trained on a limited set of data. This process is particularly beneficial in situations where large amounts of training data are not available or practical. On the other hand, zero shot refers to an even more impressive capability of models. It implies that a model can identify and understand concepts or tasks that it has not been explicitly trained on. It's like having an intelligent system that can make logical assumptions and deliver solutions based on the knowledge it has gained, even when faced with completely new scenarios. So we can see that LLMs can excel even in zero shot scenarios thanks to their training on vast datasets. They can handle new situations by leveraging their extensive knowledge to infer appropriate responses, even without having directly encountered the specific scenario in their training data. In essence, LLMs can be quickly adapted to a wide range of tasks, even when those tasks are outside of the specific domain that the model was originally trained on. This adaptability opens up a world of possibilities for using these models in diverse fields and applications. 
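As a small illustration of zero-shot behavior in practice, here is a minimal sketch using the Hugging Face transformers zero-shot classification pipeline, which labels text against categories the model was never explicitly trained on. The library and checkpoint name are assumptions for illustration:

```python
from transformers import pipeline

# Zero-shot classification: no task-specific training examples are provided at all.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The keyboard stopped working after two days and the refund took weeks.",
    candidate_labels=["positive review", "negative review", "shipping question"],
)
print(result["labels"][0])   # most likely label, e.g. "negative review"
```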
A key benefit of LLMs is their consistent improvement as we increase the amount of data and the number of parameters involved in their training. For example, consider the journey from GPT-3.5, with 175 billion parameters, to GPT-4, whose parameter count is undisclosed but widely estimated to be far larger. The substantial increase in scale led to a notable advancement in the model's capabilities, understanding, and precision. This growth trend suggests that LLMs can evolve even further as we continue to push the boundaries of available data and computational resources. GPT-4 significantly outperforms GPT-3.5 due to its larger scale. It shows superior understanding of context and nuance, delivers more accurate responses, and performs better in translation and summarization tasks. In addition, GPT-4 is more capable of understanding complex instructions without having to break them down into smaller steps, showcasing its superior ability to adapt to zero-shot and few-shot learning scenarios. And the fourth benefit of LLMs is that, by interacting in natural language, they improve accessibility to AI for anyone with a basic computer, eliminating the need for specialized technical skills. Whether you're a student looking for help with homework, a writer needing inspiration, or a business owner seeking market trend analysis, LLMs are here to help. Their speech recognition and human voice synthesis abilities open up possibilities for those who might struggle with typing, or even individuals who cannot read and write. In addition, their high-quality translation capabilities remove language barriers, making these powerful tools usable by people from diverse demographics and backgrounds. Essentially, LLMs are transforming the way we interact with technology, bringing complex AI abilities to a broad range of users worldwide. In conclusion, large language models are breaking down barriers, making AI accessible to all. With their versatile capabilities and ever-evolving potential, LLMs are revolutionizing the way we interact with technology. As we look to the future, we're sure to see these models continue to enhance our lives and work in unimaginable ways. In the next video, we look a little deeper into three examples of LLMs: PaLM and LaMDA from Google, and GPT from OpenAI. See you in the next one. 12. L3V3 - Examples of LLMs: In this video, we look at some examples of large language models. We get into the details of three state-of-the-art LLMs, PaLM, LaMDA, and GPT, and we will also discuss some other LLMs which have shown promise in the field of generative AI. So let's start with PaLM, which stands for Pathways Language Model. PaLM is a 540-billion-parameter language model developed by Google AI. It is trained on a massive dataset of text and code and can perform a wide range of tasks, including question answering, natural language inference, code generation, translation, and summarization. It utilizes Google's Pathways system, which enables it to be trained on a massive dataset of text and code. With 540 billion parameters, PaLM is one of the largest language models in the world. It is a dense decoder-only transformer model, which means that it is specifically designed for natural language generation tasks. PaLM achieves state-of-the-art few-shot performance on most tasks, which means that it can learn to perform a new task with only a few examples. This makes PaLM a powerful tool for a variety of applications. So what is the Pathways system?
The Pathways system is an AI architecture designed to remain highly efficient while generalizing across different domains and tasks. It is able to effectively train a single model across multiple TPU v4 Pods, which are Google's custom-designed machine learning accelerators. This allows Pathways to handle many tasks at once, reflect a better understanding of the world, and learn new tasks quickly. Pathways achieves this by using a number of techniques, including model parallelism. This technique splits a single large model across multiple accelerators so that its different parts can be trained simultaneously. This can improve the training speed and efficiency of Pathways. Data parallelism. This technique allows multiple copies of the same model to be trained on different slices of the data. This can improve the accuracy of Pathways by allowing it to learn from a wider variety of data. And automated machine learning. This technique allows Pathways to automatically optimize its training parameters. This can improve the performance of Pathways by preventing it from overfitting to the training data. Pathways is still under development, but it has the potential to revolutionize the way we build and deploy AI models. By enabling models to orchestrate distributed computation across accelerators, Pathways can make it easier to build and train large, complex AI models that can handle a wide variety of tasks. Next, let's discuss another LLM from Google, LaMDA. LaMDA stands for Language Model for Dialogue Applications. LaMDA is a family of neural language models developed by Google AI. It is trained on dialogue and has up to 137 billion parameters, pre-trained on a dataset of 1.56 trillion words. LaMDA has three key objectives: quality, safety, and groundedness. These objectives are measured by metrics such as sensibleness, specificity, interestingness, and informativeness. LaMDA is designed to be informative and comprehensive while also being safe and grounded. It is able to generate different creative text formats like poems, code, scripts, musical pieces, emails, letters, and much more, and it will try its best to fulfill your requirements. LaMDA has the potential to revolutionize the way we interact with computers. It can be used to create more natural and engaging dialogue experiences and to provide users with more helpful and informative assistance. As we said earlier, there are three key benefits to LaMDA. One, natural and engaging dialogue. LaMDA can engage in natural and engaging dialogue with humans. It can understand the context of a conversation, and it can respond in a way that is both informative and interesting. Two, helpful and informative assistance. LaMDA can provide users with helpful and informative assistance. It can answer questions, generate creative text formats, and follow instructions. And three, safe and grounded. LaMDA is designed to be safe and grounded. It is trained on a massive dataset of text and code and is able to distinguish between safe and unsafe content. And now let's move on to OpenAI's GPT, which is short for Generative Pre-trained Transformer. GPT is a type of deep learning model used to generate human-like text. It was developed by OpenAI, an AI research company that receives major funding from Microsoft. GPT utilizes a transformer architecture, which is a type of neural network that is well suited for natural language processing tasks. The parameter count for the latest version of GPT, which is GPT-4, is undisclosed, but it is likely much larger than GPT-3.5's 175 billion parameters.
This means that GPT-4 has a greater capacity to learn and understand language. GPT-4 has also been shown to be proficient at skills assessments such as the bar exam. In a recent study, GPT-4 scored in the 90th percentile on the bar exam, which is a standardized test that is required for admission to the bar in many jurisdictions. GPT is a powerful tool that can be used for a variety of tasks, including generating text, answering questions, translating languages, and much more. In addition to what we discussed so far, there are other LLMs that are transforming how we look at AI and are helping shape the future of the field. Let's briefly review some of them. The first one is Turing NLG by Microsoft, a large-scale language model trained on diverse Internet text that is capable of writing coherent paragraphs and even whole articles. BERT by Google, a revolutionary transformer-based model that is pre-trained on a large corpus of text and fine-tuned for various natural language processing tasks, providing a high level of understanding of context and semantic meaning. Transformer-XL, a language model developed by the Google Brain team that innovatively handles long-term dependencies in sequences, significantly enhancing the performance of tasks like text generation and translation. There's also XLNet, which is an extension of Transformer-XL, developed by Google Brain and Carnegie Mellon University. It uses a permutation-based training method to overcome some limitations of BERT and outperform it on several benchmarks. ELECTRA is a highly efficient pre-training approach developed by Google Research that uses less compute power for similar or even better performance than models like BERT. We have Megatron, a transformer-based model developed by NVIDIA, designed to train very large language models with billions of parameters. It leverages the parallel processing capabilities of modern GPUs. And we have LLaMA. Introduced by Meta, LLaMA is a foundational, smaller yet performant large language model. It's designed to broaden access to AI research, requiring less computational power and resources for testing new approaches and validating existing work. LLaMA is also available in different sizes, ranging from 7 billion to 65 billion parameters. So in this video, we dived into the exciting advances in large language models, focusing on Google AI's PaLM and LaMDA, as well as OpenAI's GPT. We also highlighted other noteworthy models in the field, such as Turing NLG, BERT, XLNet, and Megatron. We discussed how these LLMs, with parameters that can reach hundreds of billions, redefine multitask learning and revolutionize our interaction with computers through engaging dialogue. These models have already demonstrated their exceptional abilities in tasks like text generation and even practical assessments like the bar exam, and they only keep getting better. 13. L3V4 - Foundation Models: In this video, you will learn about foundation models, which, as the name suggests, provide a foundation for generative AI models. Specifically, we start by providing a definition for foundation models and explaining what they are. Then we will talk about platforms that provide different kinds of foundation models, with a focus on Vertex AI's Model Garden, and end the video by discussing different types of foundation models. Now, what are foundation models? Let's ask Bard to help us answer the question. It provides three different drafts.
Foundation models are large pre-trained neural networks that can be fine-tuned for a variety of tasks such as natural language processing, computer vision, and speech recognition. Foundation models are large language models trained on massive datasets that can be fine-tuned for a variety of downstream tasks such as translation, question answering, and summarization. Foundation models are large pre-trained machine learning models that can be adapted to a wide range of tasks such as natural language processing, computer vision, and robotics. So foundation models are large AI models that can be adapted to a wide range of tasks and can generate high-quality output. Even though AI models aren't brand new to us, there's something quite different about these foundation models. They come equipped with several key characteristics that set them apart, marking a significant shift from the AI models we've seen in earlier generations. Foundation models aren't limited to a single task. They're multitask. A single foundation model can tackle a wide range of tasks right out of the box, such as summarization, question answering, or classification. They can handle various modalities of data types, including images, text, code, and much more. With minimal or no training at all, foundation models can work well out of the box. They can also be tailored for specific use cases using only a handful of example data. Because they are typically trained on vast amounts of diverse data, these models can learn general patterns and representations, which can then be applied across various domains and tasks. Before now, foundation models were difficult to access. They required specialized machine learning skills and compute resources to use in production. But with the recent wave of advancements in generative AI, things are changing dramatically. For example, take Vertex AI, a fully managed machine learning platform available on Google Cloud. If you are already familiar with Google Cloud's tools, you already know that Vertex AI enables you to access, build, experiment with, deploy, and manage different machine learning models. Things like traditional data science, machine learning, MLOps, or simply creating an AI-driven application: Vertex AI is equipped to support all such workloads. That's pretty cool and all, but this is where things start to get really interesting. Recently, Google Cloud announced two major tools which enable us to do even more: Model Garden and Generative AI Studio. These tools make foundation models available to a much broader audience, even without much experience with coding and ML development. The last section of this course is dedicated to introducing Google Cloud Gen AI tools, and we will talk more about Generative AI Studio in that section. So in this video, let's only focus on Model Garden. What exactly is Vertex AI's Model Garden? It's a single place to explore and interact with both Google's industry-leading models and popular open-source models, all with Google Cloud's enterprise MLOps tooling and support built in. It houses both traditional machine learning models and foundation models for generative AI applications. Inside Model Garden, you'll find a range of models from Google Cloud, Google Research, and various external sources, accommodating a variety of data formats. So this is what the inside of Vertex AI's Model Garden looks like.
With many different enterprise-ready models at your disposal, Model Garden enables you to select the most suitable model depending on your use case, your expertise in ML, and your available budget. Please keep in mind, we are using Vertex AI's Model Garden as an example of a platform that Google Cloud provides for different generative AI and other machine learning tools and APIs. There are other companies which have their own versions of Model Garden as well. For example, Amazon SageMaker, IBM Watson Assistant, NVIDIA Clara, Dataiku, the OpenAI ChatGPT API, Microsoft Azure Machine Learning, DataRobot AI, and the Databricks Lakehouse Platform all provide tools and APIs for both traditional machine learning models and generative AI foundation models. There are different types of foundation models, including text generation and summarization, chat and dialog, code generation and completion, image generation and modification, and embeddings. Now, let's have a deeper look into each of them. Text models: these models help you perform natural language tasks with zero- or few-shot prompting. They can do tasks like summarization, entity and information extraction, idea generation, and much more. For instance, a journalist could use text models to summarize large articles or reports. An academic researcher could extract specific entities or pieces of information from a massive corpus of papers. Or, in a brainstorming session, an entrepreneur might use the model to generate new ideas or perspectives. As mentioned earlier, these models work effectively right out of the box. However, if you want the model to follow certain specifications, you can provide structured examples to guide its responses. This allows for a tailored experience that aligns with your specific needs and objectives. Next, let's focus on dialogue. These models are also text-based, but they've been fine-tuned to hold a natural conversation. Dialogue models allow you to engage in multiple-turn conversations, keeping the context throughout the interaction. Consider a scenario in a customer support center: an AI chatbot powered by these dialogue models can assist customers, remembering the previous turns of the conversation and providing context-aware responses. It can answer questions, summarize information, or even guide users through complex procedures, all while being fine-tuned to your specific domain. These models can help you build powerful tools that greatly enhance the user experience, whether deployed in a browser, a mobile app, or other digital interfaces. Moving on to code completion and generation. These models act as your supercharged coding assistant. You can give a natural language prompt to describe a piece of code you want written, or you can use the model to auto-complete a piece of code. There are even extensions for IDEs that can take a partial code snippet as input and then provide the likely continuation. Imagine you're working on a complex software project: such a model can help eliminate the tedious aspects of coding and even provide help with debugging your code, allowing you as a developer to focus more on creative problem solving and less on syntax or routine code. And now let's dive into image generation. These models allow you to generate and edit images according to your specifications. In addition, you can utilize these models for media-related tasks, such as classification, object detection, and more. Plus, such models typically incorporate content moderation mechanisms to ensure responsible AI safety practices.
Imagine you're building an e-commerce platform. A model for object detection can automatically tag items in the product images, while an image generation model can create new product images based on descriptions. The content moderation feature would ensure all user-generated content aligns with your platform's policy, enhancing the user experience. Last but definitely not least, let's talk about embeddings, which might sound a little complex, but it's actually a really cool concept. So let me explain it for you. Imagine you have a huge basket of fruits and you want to sort them out. You could sort them by color, size, weight, or even taste. Similarly, in the world of data, we often need to sort or categorize things, but the things we're dealing with are words or phrases, not fruits. That's where embeddings come in. They are like a unique ID card for every word or phrase, but instead of a card, it's a list of numbers, which we call a vector. This list of numbers captures the essence of that word or phrase: its meaning, its context, and its relationships with other words. With embeddings, we can make sense of unstructured data, like a long book or a Twitter feed, and use this understanding to do things like powering recommendation engines or targeting advertisements more effectively. For example, consider the realm of e-commerce. An embedding model can be used to power recommendation engines, matching users with the products they are most likely to be interested in based on their browsing history. Or in digital marketing, these models can enhance ad targeting systems, enabling highly personalized advertising. They can also be used for complex classification tasks, search functionality, and many other applications.
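If you'd like to see the embedding idea in code, here's a minimal sketch using plain NumPy. The four-dimensional vectors and product names below are invented purely for illustration; real embedding models produce vectors with hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two embedding vectors, from -1 (opposite) to 1 (identical)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional embeddings; the numbers are invented for illustration only.
product_embeddings = {
    "running shoes":  np.array([0.9, 0.1, 0.3, 0.0]),
    "trail sneakers": np.array([0.8, 0.2, 0.4, 0.1]),
    "coffee grinder": np.array([0.0, 0.9, 0.1, 0.7]),
}
# Embedding representing what the user has been browsing (also illustrative).
user_history = np.array([0.85, 0.15, 0.35, 0.05])

# Recommend products whose embeddings are closest to the user's history.
ranked = sorted(product_embeddings.items(),
                key=lambda item: cosine_similarity(user_history, item[1]),
                reverse=True)
for name, vector in ranked:
    print(f"{name}: {cosine_similarity(user_history, vector):.3f}")
```

The items with the highest similarity scores are the ones a recommendation engine would surface first.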
So in conclusion, foundation models represent a significant leap forward in AI technology. They offer a powerful, adaptable base that can be used for a wide range of tasks right out of the box. With platforms like Vertex AI's Model Garden, these tools are more accessible than ever before, putting advanced AI capabilities into the hands of a much wider population of users. From natural language tasks to multi-turn dialog, code completion, image generation and modification, and semantic information extraction, the potential applications of these models are vast. Whether it's enhancing customer service with AI chatbots, assisting developers with auto-generated code, or powering recommendation engines, foundation models are shaping the future of AI. With foundation models and the power of generative AI, we're not just predicting the future; we're building it. In the next video, we see some of the amazing applications that different types of generative AI models offer. 14. L3V5 - LLM Development: Let's talk about how large language models are developed. In this video, we start by providing a comparison between LLM development and traditional machine learning development. Then we will talk about three main kinds of LLMs, and at the end, discuss a concept called chain-of-thought reasoning and how it can help with designing better prompts for LLMs. Let's kick things off by comparing the development of LLMs using pre-existing models with the traditional approach of machine learning development. In the LLM world, there's no prerequisite for technical expertise or extensive training examples, and guess what? You can forget about model training, too. It's all about the art of prompt design: clear, concise, and full of useful information. Traditional machine learning, on the other hand, requires you to roll up your sleeves and dig deep into training examples and model training, sometimes even needing a basic knowledge of hardware and computing power. There are three main types of LLMs: generic, instruction-tuned, and dialog-tuned. Each of these models requires its own style of prompting. Generic language models work like your phone's autocomplete, predicting the next word based on the training data's linguistic patterns. Instruction-tuned models, on the other hand, are responsive to specific directives, whether it's summarizing a text, generating a poem in the style of a famous poet, or offering a sentiment analysis of a statement. These models follow the instructions embedded in the input. Lastly, we have dialog-tuned models. These are a subset of instruction-tuned models specifically designed for interactive contexts, much like a chat with a bot. So let's dive into examples of these three kinds and see them in action. Before jumping into examples of different kinds of LLMs, let's provide a definition for tokens. A token is a unit of data that the model processes; it can be a word or part of a word. We'll begin with the generic language models. They're pretty straightforward: their primary task is to predict the subsequent word based on the context provided by the training data. Let's take a simple example: "The cat sat on..." Now we want to know what the most probable next word is, and the model tells us that "the" is the answer, just like your phone's autocomplete feature would suggest. It's a fascinating glimpse into how AI can mimic the way we naturally communicate. Moving on to instruction-tuned models, these models shine when it comes to generating responses. They take their cue from the instructions given in the input, whether it's a request to summarize a text, generate a poem in a particular style, or even classify a text's sentiment. It's like having your very own digital assistant, always on standby to carry out your instructions precisely. And lastly, we have dialog-tuned models, which are a specialized type of instruction-tuned models. However, they aren't just waiting for instructions; they're trained to engage in a back-and-forth conversation. You might typically encounter them in the form of chatbots. If you've ever asked a virtual assistant a question, it's likely you've interacted with this kind of model. It's all about enabling a natural conversational interaction. Now it's time to explore an interesting concept: chain-of-thought reasoning. This is the observation that models are more accurate at producing correct answers when they first generate a reasoning pathway, or chain, leading to the answer. Let's consider a simple example: Roger has five tennis balls and buys two more cans, each with three balls. How many balls does Roger have now? Initially, the model might struggle to provide the correct answer. However, when the problem is presented again with a worked-out reasoning path, the model becomes much more likely to conclude with the correct answer. Chain-of-thought reasoning assists in enhancing the understanding and response capabilities of large language models.
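To make the chain-of-thought idea concrete, here's a minimal sketch of what such a prompt could look like in Python. The worked example and the follow-up question are illustrative; you would send the resulting string to whichever LLM you're using.

```python
# A few-shot, chain-of-thought style prompt: the worked example shows the model
# a reasoning path before the answer, and the final question is left for the model.
cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. 5 + 6 = 11.
The answer is 11.

Q: The cafeteria had 23 apples. It used 20 to make lunch and bought 6 more.
How many apples does it have now?
A:"""

# Send this string to the LLM of your choice; here we just print it.
print(cot_prompt)
```

Because the prompt demonstrates the reasoning steps and not just the final answer, the model is nudged to reason through the second question the same way before answering.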
In conclusion, the development and deployment of large language models open up exciting new avenues in the world of machine learning. As we continue to improve and refine these technologies, we anticipate a future where advanced language comprehension by AI drastically changes our interaction with digital platforms. Now that we understand how LLMs are developed, it's time to see why tuning them for specific tasks is important and how we can tune LLMs in an efficient way. See you in the next video. 15. L3V6 - Tuning LLMs: In this video, we will talk about the importance of tuning LLMs for specific tasks and how to do it efficiently. It's an interesting thought to have a model that can handle everything, but in practice, LLMs come with their fair share of limitations. To increase their reliability and efficiency, LLMs need to be fine-tuned for specific tasks and on specific domain knowledge. Just like a professional athlete specializing in their sport, these models need to refine their skills to master their performance. Let's start with a simple task example: question answering. This is a subdomain of natural language processing that's about automatically answering questions posed in everyday language. These Q&A systems are powerhouses capable of tackling a range of questions, from factual to opinion-based, thanks to their extensive training on text and code. However, the secret ingredient for this model's success is domain knowledge. Consider this: when you're developing a Q&A model for customer support, healthcare, or supply chain, domain knowledge becomes a critical requirement. In customer support, a domain-tuned LLM could provide insightful information on subscriptions and services, ensuring your clients receive efficient AI-assisted service. In the realm of education, these models can offer detailed information about courses, tuition fees, or academic policies. For healthcare, they could serve as self-management tools for patients, providing critical health-related information. Retail businesses could benefit from better AI chatbots and product visualization, elevating the customer experience. And in the realm of supply chain management, LLMs could offer valuable logistics information and inventory insights. And let's not forget about the big tech companies: they could use these models to provide superior tech support to customers. Each sector has its own unique requirements, and tuning an LLM according to these specifications can drastically enhance the model's effectiveness. While generative Q&A models can use their trained knowledge base to answer questions without needing specific domain knowledge, fine-tuning these models on domain-specific knowledge significantly boosts their accuracy and reliability. It's like providing the model a detailed map of the terrain it's supposed to navigate. Take Vertex AI as an example. It provides task-specific foundation models that are already tuned for a variety of use cases. Let's say you want to better understand your customers' sentiment toward your products or services: Vertex AI has a sentiment analysis task model that is just right for the job. Perhaps you're in the retail or real estate sector and you need to perform occupancy analytics; there's a task-specific model designed for that as well. These models, honed for specific tasks, demonstrate the value of tuning. They're more efficient, targeted, and effective at their respective jobs. The ability to select and use a model that aligns with your specific needs can dramatically enhance the overall effectiveness of your AI solutions. Okay, now it's time to formally define what we mean by tuning. Tuning refers to the process of adapting a pre-trained model to a more specific task, such as a set of custom use cases or a new domain, by training it on new data.
Tuning is achieved by training the model on new data that's relevant to the task at hand. For example, if we're working within the legal or medical sector, we would collect training data from these domains to tune our model accordingly. But what is fine-tuning? Think of fine-tuning as a high-precision adjustment to the model. You bring your own dataset and retrain the model, affecting every weight in the LLM. It can be a labor- and resource-intensive job and requires hosting your own fine-tuned model, which can make it impractical for many use cases. But it's important to know that a fine-tuned model is equipped with a high level of accuracy and specificity. Let's take a real-world example to illustrate the power of fine-tuning. Picture a healthcare foundation model that has been extensively trained on a wide range of healthcare data. It can perform various tasks seamlessly: answering medical questions, analyzing medical images, finding patients with similar conditions, and much more. The reason behind this success is fine-tuning with domain-specific knowledge. By doing so, it becomes a specialist rather than a generalist, delivering precise and reliable results within the healthcare context. This process underscores the immense potential and versatility of tuning, transforming a one-size-fits-all model into a highly specialized tool able to navigate complex healthcare scenarios. Fine-tuning is a great way to boost a model's performance, but similar to renovating a whole house, it can be expensive and not always practical. So if we're looking for a more efficient way to tune large language models, what can we do? One approach to follow is parameter-efficient tuning methods, or PETM for short. Think of PETM as giving your model a makeover instead of a complete renovation. Normally, with fine-tuning, we adjust all of the parameters of the model, which is complicated and time consuming. But with PETM, we focus on changing just a small subset of these parameters, or even adding a few new ones. Maybe we add some extra layers to the model or throw in an extra piece of information. Figuring out the best way to do this is still a hot topic among researchers. The key takeaway here is that PETM is like a shortcut: it helps us avoid the need to retrain the entire model, saving us time, effort, and resources. Plus, it even simplifies the process of using these models later on, as we just use the base model and add on our extra bits.
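As a rough illustration of the parameter-efficient idea, here's a minimal PyTorch sketch: the "base model" below is a tiny stand-in for a pre-trained LLM, its weights are frozen, and only a small added adapter is trained. The layer sizes and training step are purely illustrative.

```python
import torch
import torch.nn as nn

# Tiny stand-in for a pre-trained model; in practice this would be a large LLM.
base_model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 768))

# Freeze every pre-trained weight so it is not updated during tuning.
for param in base_model.parameters():
    param.requires_grad = False

# A small trainable adapter added on top; only these parameters get tuned.
adapter = nn.Sequential(nn.Linear(768, 16), nn.ReLU(), nn.Linear(16, 768))
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)

def forward(x: torch.Tensor) -> torch.Tensor:
    h = base_model(x)      # frozen base representation
    return h + adapter(h)  # plus a learned, task-specific correction

# One illustrative tuning step on random stand-in data.
x, target = torch.randn(8, 768), torch.randn(8, 768)
loss = nn.functional.mse_loss(forward(x), target)
loss.backward()
optimizer.step()

trainable = sum(p.numel() for p in adapter.parameters())
frozen = sum(p.numel() for p in base_model.parameters())
print(f"Tuning {trainable:,} parameters instead of {frozen + trainable:,}")
```

The point of the sketch is the ratio printed at the end: only a tiny fraction of the total parameters needs to be trained, which is exactly why PETM saves time, effort, and resources.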
We've now reached the conclusion of our exploration into the world of large language models. In this section, we've gained valuable insights into LLMs, starting with an introduction to their structure and function. We discussed the numerous benefits of using LLMs and provided some examples of them, including PaLM, LaMDA, and GPT. We also talked about the process of LLM development, highlighting how it's different from traditional machine learning development. And most importantly, we underscored the significance of tuning LLMs, diving into the ways it enhances their reliability and accuracy. We've seen how domain-specific knowledge can significantly improve their performance and learned about efficient tuning methods like parameter-efficient tuning methods. In the next section, we will familiarize ourselves with four major tools on Google Cloud that enable us to access and fine-tune generative AI models and build our own generative AI applications. 16. L4V1 - App Sheet: Let's talk about AppSheet, an innovative no-code platform from Google that is leveraging the power of generative AI to transform app development globally. Imagine a world where anyone, irrespective of their coding skills, can quickly create data-centric apps for Google Workspace. That's exactly what AppSheet is designed to do. Remember the tedious process of traditional app development: from conceptualization to drafting project specifications, and from team collaboration to coding, it was a long and exhausting journey. But with AppSheet, the app development life cycle has been streamlined drastically. What used to take months can now be accomplished in days or even hours, freeing up your time for more valuable tasks. The beauty of AppSheet is its versatility. It enables you to create apps for multiple platforms, including desktop, mobile, and chat applications. AppSheet supports an array of applications that are as varied as your specific needs. It might be that you're managing a warehouse and need a streamlined solution for inventory tracking. Perhaps you're organizing a major corporate event and need a detailed event planning tool, or maybe you're running a massive, multifaceted marketing campaign and need an app to coordinate the many moving parts. From customer relationship management to supply chain coordination, employee scheduling to project management, AppSheet is adaptable enough to handle your specific needs. Its flexibility and versatility open the doors to countless scenarios, making it a go-to platform for developing customized data-centric apps. As long as you have a clear idea about what you need and you can explain it in natural language, AppSheet can help you turn that idea into an application. So let's dive deeper. Recently, AppSheet introduced new capabilities, all powered by generative AI. Now, you can turn your idea into a fully functional app within minutes, and you can do this using natural language. For instance, say you want to create an app for tracking travel expenses. All you need to do is describe your process to AppSheet. AppSheet then takes over, asking follow-up questions to better understand the requirements of your app. Once AppSheet has gathered enough information, it presents a preview of the tables for your app and even provides sample data to help you test it. Then AppSheet proceeds to build the starter app for you. As soon as the app is ready, you can launch it, try it out, and make any necessary adjustments. Interestingly, you can continue using natural language to specify the changes you want, and AppSheet will assist you in refining your app. Creating an app through natural language with zero coding: that's the magic of AppSheet. It empowers anyone to develop applications for their organization rapidly and efficiently. So let's see how AppSheet leverages generative AI to make this possible. When the user interacts with AppSheet, Dialogflow and generative AI, aided by a custom-trained LLM, work together to provide the necessary information to create the app. Dialogflow gathers essential information about the user's business problem that AppSheet will use to construct a starter app. AppSheet tries to make this app align as closely as possible with the user's ideal solution. After Dialogflow has gathered the required information, AppSheet sends a request to the LLM to assist in generating the data model and views needed for the app. When the LLM provides the right schema, AppSheet utilizes all the collected information to build a starter app in just a few minutes.
The delivered app includes a comprehensive database, an intuitive app interface, and any specific configurations that were expressed during the interaction, such as notification preferences. Once the starter app is ready, users can continue collaborating with AppSheet to fine-tune and improve the app further. Depending on the complexity of the request, AppSheet may utilize both Dialogflow and the LLM during this interaction. The combination of Dialogflow and the LLM enhances the capabilities of AppSheet, allowing it to handle even the most complex app development requests. You can even customize these two technologies. For Dialogflow, you can customize it to help you create conversational chat interfaces. Here's how you do it: first, you create a Dialogflow agent; then you define your custom intents and entities; once that's done, the Dialogflow API is there to help you integrate this agent into your application. In essence, you're tailoring a piece of sophisticated technology to your specific needs. For the LLM, you can design a model to cater to your application's unique demands with Vertex AI's Generative AI Studio. This platform presents a range of foundation models from Google Cloud that you can refine according to your needs. You can achieve this by formulating and adjusting prompts as necessary and honing the models using your own data.
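To give a flavor of that Dialogflow integration step, here's a minimal sketch assuming the google-cloud-dialogflow client library. The project ID, session ID, and sample message are placeholders for illustration only; a real integration would point at the agent you created with your own intents and entities.

```python
from google.cloud import dialogflow  # pip install google-cloud-dialogflow

def detect_intent(project_id: str, session_id: str, text: str,
                  language_code: str = "en") -> str:
    """Send one user message to a Dialogflow agent and return its reply."""
    session_client = dialogflow.SessionsClient()
    session = session_client.session_path(project_id, session_id)

    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code=language_code)
    )
    response = session_client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    return response.query_result.fulfillment_text

# Placeholder IDs and message, purely for illustration.
print(detect_intent("my-gcp-project", "demo-session-123",
                    "I need to track travel expenses"))
```

The agent matches the message to one of your custom intents and returns the fulfillment text, which your application can then display or act on.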
So in conclusion, by benefiting from generative AI technologies, specifically Google's Dialogflow and a custom-trained LLM, AppSheet allows any individual to develop data-driven apps with no coding experience and in a very short time. It has become a powerful platform that allows users to generate different apps using natural language. And that concludes this section on four powerful generative AI tools available on Google Cloud. In the next and last section of this course, we will try to use these tools to build our own application using the power of generative AI. 17. L4V2 - Gen App Builder: Gen App Builder, available through Google Cloud's Vertex AI, masterfully blends foundation models with the strength of search and conversational AI, empowering a new range of users to craft innovative generative AI applications in no time and with no coding skills required. The human-like and engaging nature of online interactions presents an opportunity for users to enhance their connections with their potential audience. For enterprises, this means better communication with customers, employees, and partners. With Gen App Builder, creating these powerful Gen AI applications does not require any coding at all. Think about crafting your personalized digital assistant, custom-made search engines, knowledge bases, educational applications, and much more. With Gen App Builder, you hold the power to bring such visions to life. Gen App Builder features a user-friendly drag-and-drop interface, making the process of app design and development much smoother. It has a visual editor which lets you easily create and modify your app content. The built-in search engine helps users find information within the app, while the conversational AI engine allows interactions in natural language. So Gen App Builder provides the flexibility to create an enterprise search experience, a conversational or chat experience, or even both. The process is simple. You start by building a content source, which could be a Word document or a spreadsheet containing information about your business. Next, you select the features you'd like to incorporate into your app. This could be search, chat, or both. And once you're done, simply hit Create. But wait, there's more. This app also allows you to control and customize generated responses or create default responses. If that isn't enough, you can always access granular control over the responses, with options to control the response type, set forbidden terms, and disable generated responses when needed. But don't worry: even with generated responses disabled, your Gen AI-powered app can still answer complex questions thanks to Google's search technology. Gen App Builder also has the capability to complete transactions on behalf of the user. With the integration of pre-structured flows for common use cases, such as checking order status or explaining bills, you can effortlessly add these functions to your app with just a single click. But it's not limited to only providing predefined functionalities: it lets you create your unique transaction flow with the help of a simple graph-based interface to outline high-level business logic. If you prefer, you can even make use of prompt-based flow creation to explain your logic using straightforward natural language. Once you're happy with your app configurations, it's ready for testing. And if everything looks good, Gen App Builder's built-in integrations facilitate a seamless launch of your app on your website or popular messaging platforms. It also offers connectivity with telephony partners. To deploy your new app, you just need to get the widget deployment code. It's as simple as that. So you can see that Gen App Builder allows you to easily publish your conversational or search bot to a website or connect it to popular messaging apps. Gen App Builder leverages the power of AI, enabling you to create chatbots that can handle tasks like answering domain-specific questions, processing multimedia inputs, and delivering multimodal responses. These chatbots can guide users to relevant content and deliver generative AI responses even without specific domain knowledge. And they can complete transactions, summarize information using AI, and have the flexibility to pause and resume conversations whenever needed. With Gen App Builder, you're crafting digital assistants that redefine the standards of online interactions. In conclusion, Gen App Builder is where the strengths of Google's current foundation models, enterprise search, and conversational AI come together. It empowers you to effortlessly build advanced applications that redefine online experiences. Its user-friendly interface and visually appealing editor pave the way for creating and modifying app content with minimal effort. With capabilities ranging from a built-in search engine to a conversational AI engine and a user-friendly, intuitive interface, Gen App Builder offers an extensive toolkit for building responsive, dynamic apps. You can build chatbots capable of handling multimedia input, multimodal responses, and domain-specific questions. And with the ability to complete transactions and pause and resume conversations, these chatbots are more than just ordinary bots: they are designed to handle the most complex tasks and interactions, all while being easy to publish and connect to your website or popular messaging apps. Now let's move on to the next interesting app builder available on Google Cloud: AppSheet. 18. L4V3 - Maker Suite: MakerSuite is an intuitive, browser-based tool designed to enable rapid and user-friendly prototyping with the PaLM 2 model.
The integration of MakerSuite with the PaLM API means that we are now able to access the API through a user-friendly graphical interface. The PaLM API is a gateway to Google's large language models and generative AI tools, facilitating time-efficient and accessible prototyping. This platform lets you test out models rapidly and experiment with different prompts. You can use it to create and fine-tune your prompts, add synthetic data to your custom dataset, generate cutting-edge embeddings, and adjust your custom models with ease. And if you come up with something you are happy with, MakerSuite offers you the capability to turn it into Python code, making it possible to call the model using the PaLM API. The PaLM API and MakerSuite are the perfect duo for generative AI development. The PaLM API is your starting point for accessing Google's LLMs, giving developers the freedom to use models optimized for various tasks. On the other hand, MakerSuite provides an intuitive interface to start prototyping and creating your unique apps. Now, let's have an inside look at MakerSuite and see how we can start prototyping with large language models in just minutes. So this is what the inside of MakerSuite looks like. Let's check out this menu on the left. Here, we can create new prompts. As we can see, there are three different types of prompts that we can create: text prompts, data prompts, and chat prompts. We also have access to our library, which is the page that's open right now. If we go to Get API Key, we can see that we have the option to create an API key for a new project. And there are also other quick links available: there is a guide for getting started, there is a prompt gallery that helps you explore different kinds of prompts, there is API documentation, and some more information about the privacy policy and terms of service. Now let's get back to our library and try different prompts. The first one is the text prompt. Let's try it. So here, there are interesting things to explore. The first things to notice are these sample prompts; there are some examples to help us get a better idea of what these prompts could look like. Also, if we pay attention to the text box, we can see that there are some examples provided for us. Let's read some of them: categorize an apple as fruit or vegetable; write a JavaScript function and explain it to me; paraphrase "it looks like it's about to rain"; and many more. This just shows what kind of prompts you can use as examples of a text prompt. Now, let's explore one of the samples that are provided here. Let's check out Casual Ponderings. So the prompt would be "rewrite this into a casual email," and then you provide the text of an email. I can click Run, and now I can see that the language model created a response to my prompt. Let's explore the other kind of prompt, the data prompt. Here, we can see that there are two different parts: the first one is a table for writing our prompt examples, and the second part is for testing our prompt. So let's look at an example and see how it would look. Let's try Opposites. In the examples, we see that we are providing four different examples of what each of these inputs should receive as an output. So if our prompt is "find a word or phrase with the opposite meaning," then we can provide examples like: if the input is strong, the output should be weak; if the input is thick, the output should be thin; and so on. After providing these examples, we can test our prompt. Now, we ask the language model:
if the input is wrong, what would be the output? And if the input is fast, what should the output be? And now, if we run, we see that in response to "wrong," the language model creates "right," and for the input "fast," the language model creates "slow." We can see that for every input, the language model creates the opposite as the output. Let's explore the third type of prompt, the chat prompt. Here, we can also see that there are two parts: there is a part for writing our prompt examples and another part for testing our prompt. So let's look at some of these samples. Let's try Chat with an Alien. In the example, we provide some context, "be an alien that lives on one of Jupiter's moons," and we provide an example conversation: if the user says "how's it going," the model should say "I am doing well," and so on. If we want to add more examples, we have the option down here. And now we can test our model. So in response, we say, "I'd like to visit. What should I do?" And the model provides an answer which is relevant and continues the conversation. We can keep interacting with the model by writing more prompts. We also have some options for tuning the model below. The first one is a text preview of the same prompt we are working on: whether it's a table prompt or a chat prompt, we can always have access to the text version of the same prompt. Through the other one, we can fine-tune our model. We can choose what kind of model we want to use, we can set the temperature that defines the level of randomness or creativity of the model, and we can also customize the number of outputs the model should produce. There are also some more advanced settings available. So to recap: first, we select our prompt type and enter a prompt, including any examples and instructions. Whatever type of prompt you use, you always have the option to see it in text form. If you need to test the model's output, MakerSuite makes it simple for you to reuse prompts in different ways by using test inputs in your prompts. We also have the flexibility to play around with the model parameters. For instance, there's an option to tweak the temperature setting, which influences the element of randomness in the model's responses; a higher value here often leads to more unexpected or even creative outputs. We can also make additional adjustments to parameters such as stop sequences, the number of outputs, and so forth. And finally, after you are happy with your prompt, you can save, share, and even export it to different developer environments. For saving your prompts, MakerSuite offers a prompt library feature, acting as a secure storage space for all your prompts and making them easily retrievable for future reference. You can also save your prompts to your Google Drive. Sharing your prompt is as simple as clicking the Share button. And if you're looking to export your work to a developer environment, just hit the Get Code button. You can export your prompts in the format that suits you: Python or JavaScript code, a JSON object, or even a cURL command. Your work in MakerSuite, including the settings, instructions, and test examples, is all stored in this code snippet.
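As an illustration, an exported Python snippet looks roughly like the sketch below, assuming the google-generativeai package; the model name, prompt, and parameter values here are placeholders rather than an exact copy of MakerSuite's output.

```python
import google.generativeai as palm  # pip install google-generativeai

# The key created from MakerSuite's "Get API Key" page (placeholder value).
palm.configure(api_key="YOUR_API_KEY")

# The same kind of text prompt we tried in MakerSuite, now called from Python.
prompt = "Rewrite this into a casual email: The quarterly report is due on Friday."

completion = palm.generate_text(
    model="models/text-bison-001",  # illustrative model name
    prompt=prompt,
    temperature=0.7,        # randomness / creativity, as in the settings panel
    max_output_tokens=256,  # limits the length of the response
)
print(completion.result)
```

The settings you tuned in the MakerSuite interface, such as temperature and the output limit, simply become keyword arguments in the exported call.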
So in conclusion, the combination of the PaLM API and MakerSuite offers an incredibly convenient and user-friendly approach to prototyping with large language models. They place the power of generative AI in your hands, providing the flexibility to experiment, tweak, and refine until you've crafted the perfect AI-driven application. See you in the next one. 19. L4V4 Generative AI Studio: As the excitement around generative AI grows, we can see that its power to speed up the application prototyping process is a game changer. If you have access to the right tools, such as Generative AI Studio and the other Gen AI capabilities that are now available through Vertex AI on Google Cloud, you can experiment, adapt, and perfect new ideas in a snap. And by a snap, I mean minutes or hours instead of weeks and months. Building an app is as easy as opening Generative AI Studio in the Vertex AI section of the Google Cloud console, selecting the modality you want to work with, choosing your preferred format, inputting your prompt, and adjusting model parameters for additional control. With Generative AI Studio, you get the chance to explore and tailor generative AI models that fit perfectly into your Google Cloud applications. You can even embed these applications into your website or mobile app. In this video, we are going to explore Generative AI Studio, available on Vertex AI. But before that, let's briefly see what other tools are available on Vertex AI. So this is what the inside of Vertex AI looks like. If we expand the menu on the left, we can see all the tools that are available to us. We can see that we have access to Model Garden, Workbench, and Pipelines. We also have Generative AI Studio, which we will talk about shortly. In addition to that, we have tools for data management, model development, and model deployment and use. Generative AI Studio helps developers create and deploy models by providing tools and resources that make it easy to get started. Generative AI Studio lets you quickly test and customize a variety of Google's foundation models through prompting and tuning, and allows you to easily deploy your tuned models. Inside Generative AI Studio, you can access Google's language, vision, and speech foundation models. The availability of some modalities varies; for example, you can see that at the time of recording this video, I do not have access to the vision models. So let's focus on language and speech, and on language for now. You can either click Language in the menu on the left or the Open button at the bottom of the Language box. If you want to get a better idea of how you can use Generative AI Studio for different purposes, you should explore the prompt gallery. So before exploring the different types of prompts, let's have a look at the prompt gallery. Here we can see a variety of sample prompts that are predesigned to help demonstrate model capabilities. The sample prompts are categorized by task type, such as summarization, classification, and extraction. Let's have a look at an example. When you open a sample prompt, you can see that the prompts are preconfigured with a specific model and parameter values, so you can just click Submit and get the model to generate a response. To work directly with the language models, we have three options: interact with the model through free-form or structured prompting, interact with the agent as a chatbot, or create a tuned model that's better equipped for our use cases. Let's explore the text or code prompt in the free-form format. So let's try designing and testing our own prompts. Here, I can give the model a prompt and ask it to produce a response. I just provided a long article here, and I'm asking the model to provide a brief summary of the article. For different kinds of prompts, I can also use my microphone and directly speak to the model.
On the right side, we can also see that there are some settings that we can use to configure the model. We can choose what type of model we want to use; here, we have two language models and two code models. We can set the temperature for the model, which controls the degree of randomness or creativity. We can also set the token limit, which determines the maximum amount of text output from one prompt. Top-k and top-p both change how the model selects tokens for the output: top-k limits sampling to the k most probable tokens, while top-p limits it to the smallest set of tokens whose combined probability reaches p. And we can also set different safety filter thresholds. So now we can ask the model to produce a response for our prompt. Let's click Submit, and we can see that the model summarized the long article into three lines. If you are doing few-shot prompting, a structured prompt template is available to make it easy by providing a form for context and examples. For structured prompts, let's get back to our wine classification example. We can provide some context to the model, which instructs how the model should respond. We can also provide multiple examples to the model; these examples help the model understand what an appropriate model response looks like. We also have our settings on the right side, and we have the option to add more columns for more complex examples. To test the model, we provide an input, whether by writing it in the input section or by directly talking to the model, and when I click Submit, the model generates a response for me. We can easily convert any structured prompt to free-form, and this is what that looks like. You can choose to initiate a text or code chat to start a conversation with the model. You can provide context and examples of interactions to further direct the conversation. All the settings for model configuration are available here as well. Now, let's try a chat prompt. In a chat prompt, we have the option to provide some context to the model, which instructs it on how it should respond. We can also provide examples to help the model understand what an appropriate response would look like: for example, if the user says this, the model should say this. We also have the option to provide more examples to the model. After providing enough context and examples, we can start chatting with the agent. So if we ask how many planets there are in the solar system, the model provides an appropriate response. Similarly, we can ask other questions, and the model keeps providing appropriate and accurate responses, consistent with the examples we provided. Now, let's see how we can create a tuned model using our own dataset. We have the option to tune a model so it's better equipped for our use cases. Let's check it out. Here, we can choose our JSON dataset and set a location to store the dataset on the cloud. After providing the dataset, we can configure the tuning details, and after that, we can tune the model based on our dataset and our settings. To decide which model would be the best fit for our specific use cases, we can check out Google's library of foundation models, which is available in Model Garden. In Model Garden, you can explore models by modality, task, and other features. With many different enterprise-ready models at your disposal, Model Garden enables you to select the most suitable model depending on your use case, your expertise in machine learning, and your available budget.
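The prompts and parameter settings we just walked through, such as temperature, the token limit, top-k, and top-p, can also be used programmatically. Here's a minimal sketch assuming the Vertex AI Python SDK and the text-bison foundation model; the project ID, prompt, and parameter values are illustrative, and depending on your SDK version the import may live under vertexai.preview.language_models instead.

```python
import vertexai
from vertexai.language_models import TextGenerationModel  # pip install google-cloud-aiplatform

# Placeholder project ID and region, for illustration only.
vertexai.init(project="my-gcp-project", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison@001")
response = model.predict(
    "Provide a brief summary of the following article: ...",
    temperature=0.2,        # randomness / creativity
    max_output_tokens=256,  # token limit for the output
    top_k=40,               # sample only from the 40 most likely tokens
    top_p=0.8,              # nucleus sampling threshold
)
print(response.text)
```

This is essentially what happens behind the Submit button: the prompt and the parameter values on the right-hand panel are sent to the foundation model, and the generated text comes back as the response.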
Okay, now it's time to explore the speech models. We can choose Speech either from the menu on the left or by clicking Open under the Speech box. Here, we have two different options: text to speech or speech to text. Let's go to text to speech. Here, we can either provide the text or directly talk to the model. After providing the text, we have some options to choose different languages or set the speed of the speech. If everything looks good, we can click Submit, and now we have a synthesized AI voice that can read that text for us. For more advanced features, like support for longer audio, we can use Speech Studio, and this is what that environment looks like. We also have speech to text. Here, we can either upload an audio file or record our own voice, and after providing the speech to the model, we can see that it turns it into text. So now I've recorded my voice and clicked Submit, and here's my speech turned into text. We can also use Speech Studio for speech-to-text applications; both features, speech to text and text to speech, are available in Speech Studio. After you've customized your model, you have a few options. You can save the prompt to the prompt gallery. You can also deploy to Vertex AI's machine learning platform for production and management. Or you can implement your newly tuned models directly into your website and applications. In conclusion, through Vertex AI's Generative AI Studio, we can access language, vision, and speech models. Through the language models, we are able to test, tune, and deploy generative AI language models; we can also access the PaLM and Codey APIs for content generation, chat, summarization, code, and more. With the vision models, we are able to write text prompts to generate new images using the Imagen API; we can also generate new areas within an existing image. And with the speech models, we can convert speech into text using the Chirp API, and we can also synthesize speech from text using Google's Universal Speech Model, or USM. And that concludes this video on Generative AI Studio on Vertex AI.
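If you want the same kind of text-to-speech synthesis in code rather than in the Studio UI, one route is Google Cloud's Text-to-Speech client library; the sketch below is an illustration of that approach (the video itself only demonstrates the Studio), with placeholder text and settings.

```python
from google.cloud import texttospeech  # pip install google-cloud-texttospeech

client = texttospeech.TextToSpeechClient()

# Placeholder text; language and speaking rate mirror the Studio's options.
synthesis_input = texttospeech.SynthesisInput(text="Welcome to the travel request app.")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US",
    ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
)
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3,
    speaking_rate=1.0,  # speed of the synthesized speech
)

response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)
with open("output.mp3", "wb") as out:
    out.write(response.audio_content)  # the synthesized AI voice as an MP3 file
```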
20. Project Demo - App Sheet No-code App Builder: In our introduction video on AppSheet, we saw that through AppSheet we can create custom apps without writing code. Recently, Google added Gen AI capabilities, which let us directly explain the type of app we need to AppSheet, and it builds us a starter app based on that explanation. We can then further modify and customize the starter app simply by chatting with AppSheet. Let's consider the following example. Anne Gray is a manager at a company, and one of her responsibilities is to oversee the travel requests of her coworkers. These requests can come from emails, chat, or meetings, which can get pretty overwhelming. She wonders if the generative AI feature in AppSheet can help her streamline operations by facilitating a solution for approving and tracking the requests. To try it, she decides to explore the AppSheet chat app available on Google Chat. Let's see how it works. In order to access this chat feature, we go to chat.google.com, then we select Explore apps and find the AppSheet chat app. Now let's see what the process looks like from Anne's perspective. On the first page, we can see that AppSheet welcomes the user and invites them to submit a description of an app or business problem they wish to solve, for example by describing a workflow. Anne does this by briefly describing what she needs: something to simplify the process of managers receiving and approving travel requests. She adds to the description by noting the kinds of data she will also need to keep track of. After entering the prompt, AppSheet responds with a general app schema. From Anne's first prompt, AppSheet has recognized that the app should have an approval flow and asks her to choose how the notifications for approval requests should be sent. As we can see, there are different options available; Anne selects only email for now. Next, AppSheet suggests a few screens she may want to include: a form for users to submit new requests, a travel summary list, upcoming travel, and a few other views. These screens are basically the backbone of Anne's app schema; they describe what her app is all about. She doesn't want a My Travel screen in her app, so she deselects it to remove it from the app and then clicks to continue. Now that AppSheet knows what to put together for Anne's travel request app, it confirms the tables that could be created in the AppSheet database. These datasets are created based on the screens, or app views, that she just selected. AppSheet creates two tables to support the schema: travel and team. We haven't entered any data in our app yet, so all these tables will be empty. If we don't have any data in the app, then how can we test to see if everything works? AppSheet thought of that too. After creating the tables, AppSheet provides the option to include sample data in the app. Anne is ready to test the app, so she picks yes. And finally, AppSheet prompts Anne to choose a name for her app. Anne calls this Symbol Travel, and that's it. AppSheet's next response is a link to a fully functional preview of the app that was just created. Let's take a break and look back at what we've done so far. Through just a few question-and-answer exchanges, AppSheet was able to take Anne's request, which was written in natural language, and recommend several solutions, including the screens that her app users will need to see, the things they'll need to do, and the place for the data to be stored. It has even set up the email notifications to users. Creating an app through natural language with no coding is a kind of magic that is now a reality in AppSheet. It enables many new users to develop applications rapidly and efficiently. Moving on, Anne is presented with the option to either preview the app that has been created for her or dive into the AppSheet editor for customization. She chooses to take a quick look at the preview first. As she navigates through the app emulator on her desktop, she explores the views that AppSheet generated, starting from new travel, proceeding to travel by user, and finally upcoming travel. This last view displays both a map and a list of future trips, all filled with the sample data that she decided to include previously. Everything seems to be in order so far. But Anne notices that a view she had in mind is missing from the app. She has a particular addition in mind: a screen that compiles all the travel requests into a comprehensive dashboard, providing the finance team with an answer to a question they frequently ask: what is the total cost of each employee's travel? In the editor, Anne notices that the generative AI feature she used before is available here too. She types in her request for a new dashboard, and in no time AppSheet takes her request, dissects it, and suggests the necessary components for this new addition.
It proposes a new calculated column for her team table and gives a preview of the chart that will represent the aggregated data. Just like she did before, Anne wants to scrutinize every part of the suggested changes, so she checks out the preview chart, and it looks fine. Then she inspects the new column in the database to ensure all is well there. She uses the link provided to see the proposed change in the approval table in the AppSheet database. With a quick look at the numbers, Anne confirms that the new view and the data changes align with her expectations. She approves the changes in the AppSheet editor, and that's it: her app is now live and ready to use, and Anne feels that she has got what she needs. Her confidence in the tables and columns AppSheet has created for her is high. Since she is happy with the functionality of her app, she gets rid of the sample data, deploys the app, and shares it with her team. Now her team can see this refined version of the app and start submitting their travel requests. Fast forward a few weeks: while Anne is going through her company's intranet for a specific form, an idea strikes her. Knowing how often her team uses Google Chat, she considers leveraging AppSheet's no-code chat app feature. This would allow her team to fetch the required form simply by chatting with Symbol Travel. Anne goes back to the editor and enables Symbol Travel as a chat app for her domain's internal spaces. This step makes it feasible for Anne's colleagues to add Symbol Travel to their Google Chat spaces, group chats, or even private conversations. Now it's time to go over the settings. By default, the Symbol Travel chat app would display a list of all accessible app views to the users. But Anne is building this chat version specifically for end users: the employees who primarily want to use the app to submit travel requests. She chooses only the necessary app views for her users, which means she has to delete everything except the request form. Next, Anne adds a welcome message for her users, providing some context on how to interact with the chat app. She decides to include a slash command. By adding this command, whenever a user types /new trip, the chat app promptly brings up the travel request form. AppSheet also provides a smart search command. This command would enable her teammates to use AppSheet's natural language processing pipeline to search her app for data or views, but she decides to keep things simple and disables the smart search command. Her last task involves setting up an automation to notify users whenever their travel approval status changes. On this page, she can create the flow she needs by working with a graphical interface. This way, she can build the groundwork for the needed automation. Once it's done, she names her automation, tweaks a few details about when it should run and how responses should be threaded, and returns to the chat app builder to wrap things up. Thanks to AppSheet's no-code chat app deployment, Anne doesn't need to deal with any further configuration for her app or its automation to work in Chat; AppSheet takes care of all the Google Cloud Platform configuration behind the scenes, all with a single click. Now Anne is ready to share her chat app with the team. And there it goes: the chat app is now live and ready to be installed and used by her entire organization. Now let's say Jeffrey Clark, a member of Anne's team, decides to use the app. Jeffrey needs approval for his travel plans to visit a customer's premises.
He has already installed the Symbol Travel chat app, so he types the /new trip command to bring up the travel request form. Jeffrey inputs all the necessary details about his upcoming trip into the form and hits Submit. From Anne's end, she can see Jeffrey's request show up almost instantaneously. The new approval request triggers an email notification to Marcus, Jeffrey's manager. Marcus receives an email detailing Jeffrey's travel request. After examining the details of the submission, Marcus goes ahead and approves the form directly from his Gmail. In a matter of seconds, Jeffrey notices a chat notification from Symbol Travel. The message is a travel approval confirmation: congrats, Jeffrey, and safe travels. So, wrapping up, we witnessed the power of AppSheet's generative AI feature. It helped Anne build and customize a solution for managing her team's travel requests using natural language and no code. Anne efficiently solved a business challenge, creating a travel request app fine-tuned to her team's needs. The seamless integration with Google Chat and the smooth operation, as shown in Jeffrey's travel request and Marcus's prompt approval, underline the platform's accessibility and efficiency. This is the power of no-code development. AppSheet's generative AI is revolutionizing no-code development, making it accessible, efficient, and intuitive.