Transcripts
1. Prod ready promo: What if I told you, instead of spending tens of
thousands of dollars to have someone answer your customer's
questions 8 hours a day, five days a week,
in the next 1 hour, you could learn how
to build a chatbot using the power of large
language models to do the same with higher accuracy and
do it 24 hours a day, seven days a week,
365 days a year. I'm Professor Reza, and I teach undergraduate and
graduate students topics on computer science and
artificial intelligence. I also have thousands of online students I
do research on AI, and I have collaborated with prestigious institutes
like MIT Media Lab, Carnegienon University,
Harvard University, and University of
California, San Diego. And those works
were published in venues like ACM
and Spring Nature. I'm going to use all of that experience and everything
else that I've learned so far to help you understand how you can use large
language models to build a customer service
chat spot that helps you answer the questions of your customers
anytime of the day. In the next hour, we're going to go over what large
language models are, and we're going to
talk about a lot of different topics that
are related to it, like PnetEngineering and ethics of using large language models. I will be talking about different platforms that are
going to be very useful for any application developer
or anyone who wants to learn how to develop applications with
large language models. We're going to talk
about platforms like hugging face and technologies like radio for making easy
interfaces for Python program. And I will watch you
through all the steps of creating an LLM based
chatbot that can fetch data from your business
documents and answer any question that your customers might have. Are you interested? So join me in the next video,
and I will tell you how.
2. Lesson1Video1- Exploring LLMs Advantages and Applications: In this video,
we'll explore what large language models
are and how they work. By the end of this video, you'll learn the key capabilities
and benefits of LLMs, as well as some
notable applications. LLMs are a type of deep
learning model that are portrained on massive
texture datasets. Then they are fine tuned
for specific tasks. They are called large because of two of their key
characteristics. One is that they are trained on enormous amount of data
in the scale of ptabtes. This gives them a broad
knowledge on languages. And two, they have a huge
number of parameters. We're talking about
trillions of parameters. This gives them a strong
ability in reasoning, including language
understanding and generation. In a nutshell, their extensive pre training
complemented by task specific fine
tuning makes them incredibly versatile and
powerful AI systems. LLMs go through two main stages. One pre training
the model ingest massive diverse datasets like
Wikipedia or Common Crawl, to build a broad
understanding of language. Two fine tuning. The pre trained model
is then customized for specific applications using smaller field specific datasets. This twister process enables LLMs to gain both
white knowledge from their general pre training
and specialized precision from their fine tuning. Now let's explore some of the major benefits and
capabilities of LLMs. One, they can understand nuance languages and generate remarkably human like texts. Two, they excel at
tasks like translation, summarization,
sentiment analysis, and question answering. Three, they can
be fine tuned for specific tasks only by
training on a small dataset. Four, they get better by more data and
larger model sizes. And five, through their
generative capabilities, they make AI more
accessible even to individuals with limited
technical knowledge. LLMs are the new powerhouses that can transform
various industries. Let's look at some examples. In healthcare, they
can be used for extracting insight from medical records or
research papers. In education, they are
capable of providing personalized tutoring and
feedback for students. In finance, you can use LLMs for analyzing earnings reports
and predicting market trends. And in entertainment,
LLMs help us with generating creative content
like stories or even scripts. And last but not
least, in retail, they can be used
for recommending products based on customer
data and reviews. There are a lot of exciting
possibilities ahead of us as LLMs
continue to evolve. Their versatility is enabling groundbreaking AI applications
across different sectors. In conclusion, large
language models represent a revolution
in AI capabilities. Their massive scale
pre training, followed by a
specialized fine tuning empowers them with exceptional
linguistic abilities. This is enabling
transformative impacts on industries from
finance to education. As these models grow
even more powerful, the future looks bright for democratizing AI
through versatile, acceptable large
language models.
3. L1V2- Understanding Prompt Engineering: This video will explore
prompt engineering, its role in utilizing LLMs, and how to prompt
effectively and responsibly. By the end of this video, you'll learn about
the different types of prompts and methods
for optimization, as well as their limitations. First, what exactly is
prompt engineering? It refers to the
strategic optimization of the prompts we
feed to AI systems. We do this with the aim to improve the performance
of these models. Prompt engineering employs
specialized techniques to produce the most accurate, relevant and useful
outputs possible. Prompt engineering is about communicating our
intent clearly. Prompts play a crucial role in shaping an AI
system's behavior. We can see them as
the interface between us and AI models to help us clearly communicate
our intentions and give directions
to the model. You may ask why prompt
engineering is important. Well, without the
strategic prompting, models have to interpret
ambiguous or vague instructions. This risks unhelpful or
even dangerous outputs. On the other hand, thoughtfully engineered
prompts allow for more precise control to tap into the AI's full capabilities. You may also have heard
the term prompt design. There is an important
difference between prompt design and
prompt engineering. Prompt design involves
tailoring prompts to specific tasks like
translation or summarization. Prompt engineering uses
specialized strategies to optimize the
model's performance. I can include techniques
like domain terminology, effective keywords,
examples, and other techniques to boost the accuracy and
relevance of the model. Now let's look at some
different types of prompts. Instruction prompting. These are straightforward
directive prompts. For example, summarize this text briefly or translate the
passage into French. There's also keyword prompting, which includes
using helpful cues. For example, please explain the key events in the
order of their occurrence. There's also domain prompting. It utilizes domain specific knowledge and
technical terminology. For example, diagnose this medical case using
clinical language, or assess this legal contract using legal
frameworks and terms. We also have role prompting. This type of prompting directs the model to adopt a persona. For example, respond as
an expert economist. Chain of thought prompting. It breaks down a complex prompt into a logical set
of actions or tasks. For example, briefly summarize
the article's key points. Then explain the
author's perspective. Finally, provide your
critical analysis. We also have shot prompting. Shot prompting provides
contextual setup before the actual prompt. We have zero shot, one shot, and few
shot prompting. In zero shot prompting, we don't provide any examples. For example, write a
short poem about nature. In one shot prompting, one example is given. For example, here's a
short poem about trees, and then we provide
a short poem. Then we continue now write a
short poem about the ocean. In fus shot prompting, multiple examples are provided. For example, here are two
short poems about weather, and then we go ahead and provide two short
poems to the model. Then we continue. Now write a short
poem about snow. FusiaPmpting is a
powerful technique. LLMs are pretty good
at following patterns. Actually, I have
a friend who was feeding Claude her old poetry, and it was asking it to write poems in her
own unique style. She was pretty impressed by it. So if you are also into
using AI for poetry, you probably should
check out Claude. Good prompt engineering can
give us a lot of power, and with great power comes
great responsibility. So let's check out some of the responsible and
ethical practices in prompt engineering. Consider potential biases
and limitations of LLMs. Validate high risk outputs like legal or
medical information. These should be validated
with subject matter experts. Make sure to iterate
carefully to optimize prompts
before deployment. We don't want to take an app to production stage before
testing it properly. Obviously, prompt
engineering comes with its own set of limitations. Let's see what some of
these limitations are. First, prompt engineering
is not a silver bullet. For example, it cannot
protect us against all unpredictable model
behaviors which can still occur. Second, even with
the best prompting, some tasks might be beyond the
capabilities of the model. We should also remember, in order to optimize the prompt, we should understand the model. It is very hard to come up with an efficient model
if we don't know how the model works and what type of prompt works
better for that model. And last but not least, no matter how well
engineered our prompts are, outputs still require
final human validation, especially for high
stakes fields. In conclusion, strategic
prompt engineering allows us to better direct AI systems and unlock their potential while
prompting responsibly. Acknowledging both the
power and limitations of prompt engineering
allows us to prompt effectively, responsibly
and ethically.
4. L1V3- How LLMs Are Deployed: In this video, we will explore how large language
models are deployed. By the end of this video, you will learn how
LLM development is different from traditional
machine learning. Examine the three
different types of LLMs, and discuss the chain of thought reasoning concept to
design better LLM prompts. Okay, let's dive deeper. Traditional machine learning involves extensive
technical expertise, massive training data,
complex model training, and some hardware knowledge for computing
resource management. LLM development instead centers on crafting effective prompts, no technical
background required. The focus is on clear, concise, and informative
prompts, rather than the details and nuances of
model design and training. There are three
primary categories of LLMs based on
their functionality, generic, instruction
tuned, and dialect tuned. Generic language models. These models predict
the subsequent words like an autocomplete feature. Their predictions
are purely based on linguistic patterns they
find in their training data. Instruction tuned models
generate responses based on specific instructions embedded in the input prompts. They can summarize,
compose poetry, analyze sentiments,
and a lot more. Dialect tuned models
are specialized subsets of LLMs focused on
conversational interactions, something we commonly
see in chat bots. This is the type of LLM that I chose to demo in this course. An interesting concept in LLM is chain of
thought reasoning. In chain of thought reasoning, the model generates
a reasoning chain before concluding and
providing an answer. It's similar to how we
break down a problem into smaller pieces in order
to understand it better. To get a better idea, let's ask Chat GPT
for an example. So I went ahead and
asked CHAT GPT. Give me an example
of how an LLM can process a complex prompt using chain of
thought reasoning. This is the response
that it provided. So the prompt is, imagine a city where all the buildings are made of
transparent material. How would this affect the
lives of people living there, the energy consumption
of the buildings, and the overall urban design. So this is what would
happen in the background. The LLM would break
the problem into smaller pieces and try to look at it from
different perspective. For this example, it could
be the relationship between transparent buildings
and privacy or light and energy consumption, or aesthetics and
urban atmosphere, or safety and security. So after considering
all these perspectives, the model goes ahead and
produces a final response. In this example, the LLM breaks down the prompt
into different aspects, for example, privacy, energy consumption,
aesthetics, and safety. Then it analyzes the effects on people and urban
design for each aspect, and finally combine
all this information to provide a
comprehensive response. Now let's look at some key considerations for crafting effective LLM prompts. Frame prompts as clear
concise instructions tailored to the model. Leverage the models
strengths and limitations. Start with simple prompts, increase the
complexity gradually and keep experimenting to learn optimal phrasing and
structures that work better. In conclusion, LLM
development is different from traditional machine
learning by prioritizing well designed prompts over
technical complexities. LLMs come in three
main varieties, generic, instruction
tuned, and dialect tuned. And it's important to know that concepts like chain
of thought reasoning enhance LLM capabilities
to generate more accurate and
coherent responses by systematically working
through the steps of a problem or argument.
5. L1V4- What Production Ready Means: In this video, we'll explore
the critical components for developing a production
ready LLM based application, an application with real world reliability
and scalability. By the end of this video, you will learn about
application performance, scaling, reliability,
and security. Deploying LLM powered apps takes more than
just the AI itself. To build production
ready LLM applications, certain key practices
are crucial. First, the application
needs to be efficient. This means it can handle
real world traffic and usage without slowing
down or crashing. Stress testing the
app early on helps us simulate high usage and find
performance bottlenecks. Second, is scalability. The infrastructure
should scale up or down automatically
based on demand. Using cloud hosting and containers enables quick
scaling of applications. Third, the app must be
reliable and stable. There should be thorough
testing to catch bogs, plus monitoring in
production to track crashes. A robust error handling system ensures the app gracefully
handles any failures. Fourth is the ease of
deployment and updates. For example, automated pipelines allow fast and
repeatable deployments. Fifth is operational visibility. We can do that through
metrics and logging. These logs gives us insight into usage patterns and errors. And finally, security is a must. Data should be encrypted
and access controlled. Tests like vulnerability
tests, identify risks, and protections like rate limiting defend against attacks. The Hugging phase
platform provides many of these capabilities
out of the box, making it easier to build
production ready applications. Hugging face models are optimized for performance
and scalability. The inference API handles
traffic spikes gracefully and security features like authentication and
encryption are built in. And that's why we run the demo for discourse
on Hugging pase. In conclusion, following the best deployment
practices results in an LLM powered application
that is efficient, scalable, reliable, deployable,
observable, and secure. This does take additional
engineering efforts, but is essential for real
world production readiness. And with diligent engineering and platforms like Hugging Face, production ready AI
is within reach.
6. L2V1- Getting Familiar with HuggingFace Platform: In this video, I'm going to talk about the Hugging
Face platform. Hugging Face is a
community like Github, but for AI developers. Their most notable product
is the Transformers library. These libraries provide many different functionalities
like classification, translation, and
question answering. There are also a lot of user contributed models that
can be used for image, video, and sound generation. Hugging face is an
open source platform, meaning that developers
from all around the world can contribute to these
models and datasets, and they can improve upon all these new AI technologies
that are available now. With this approach, Huggingface
is lowering the barrier for entry for developing
intelligent applications. So there are three different
components in hugging face, which are distinct but
also interconnected. So we have models. We have
datasets, and we have spaces. Let's have a look at models. So these models are pre trained machine learning
and large language models, which users can clone
into their own workspace, and they can customize it
or even improve upon it. There's also a repository
of datasets which is used for training and
evaluating the models. Users can also contribute and add their own datasets
to the platform. We have spaces which are relatively newer addition to
the Hugging face platform. Using spaces, users can create, share and explore interactive
web applications. These spaces provide the
capability to interact real time with the models that are available
on Hugging face. And similar to the models, you can clone each
of these spaces and customize or improve
it however you want. Creating a workspace on
Huggingface is pretty easy. So you just have
to create signup, and then with an email
and setting a password, we can create an account. So this is how my
workspace looks like. I can go to my profile and
I can see all the spaces, models and datasets that I have. So here I don't have
any models or datasets, but there are a few spaces
that I've been playing around. Actually, this FAQ chatbot is the application that I'm going to demo for
you in this course. So right now this space
is asleep because all of these spaces and all of the models are using
actual hardware. So when we're not using them, they go to sleep to
save costs both on huggingface side and
on our own side. In each of the spaces, we have the ability to
check out the files. So this is the
repository of the space, and we can modify and
customize each of these spaces by editing the code inside
the app dot py file. There's also a
community feature. In this community feature, we can create new discussions and interact with
other developers. We can learn from them and we can also help our
fellow developers. And we can also access the
setting for our space. And here we have the option to improve the
hardware we're using. We have a selection of
different CPUs and GPUs, and we can also set how much storage we want
to use for our space. There are also more settings
like restarting the space or changing its visibility
from private to public or the
other way around. We can also set different
variables like different APIs, which we will discuss in the future videos
in this lesson. So in conclusion, in this video, we introduced the
Huggingfas platform to you, and we talked about
the importance of open source in rapid
development of AI applications. And we also covered some
components of Hugging pace, which are models,
datasets, and spaces.
7. L2V2- Creating Web Interfaces Using Gradio: This video we'll explore
radio, a Python library, which allows for
interactive demos with only a few codes in Python. I'm going to walk you through different components that can be used in a radio interface, and I will show you some
hugging face spaces as real world examples of what these interfaces are capable of. Let's explain how gradio works with a hello
world example. Here in the gradio website, we can see instructions
for how to install gradio. It's pretty simple. It's only with one line of command line. I'm going to skip
through here because I want to focus on what
this interface offers. So by looking at the code here, we see that we
have an interface, a gradio interface that has a text input
and a text output. As we can see here, there's
an input called name, and there's an output that can generate the output for us. So if I enter my name here
and then click Submit, it will show a message
which is greeting. So the interface class
that you're using has three different parameters.
Let's check them out. So the first one is FN, which is the function to wrap
the user interface around. We also have inputs and outputs. Each of these inputs and outputs can be of
different types. For example, they can be text, image, audio, video, and more. We can also set
different attributes for each of the components. So for example, here
in this textbook, we can have two lines
instead of one. So now here, this input textbook has a height of two
lines rather than one. We can also have multiple
inputs and output components. For example, here, we
have a Grit function which has different inputs
and different outputs. So this is how the
interface looks like. We have a text input. We have a checkbox input, and we have another input, which is a slider to
set the temperature. So I can set the temperature. I can enter my name here. I can enter my name here. And let's say it's not morning. So if I click Submit, it says, Good evening, Reza. It's 70 degrees today. And down below, there's
another output, which is a conversion of
Fahrenheit to Celsius. We can also use
image components. So this is how the code
for it looks like. So right now inside
the radio app, the component that
is used for image is giving an error, but
that's not a problem. That's why we have hugging face. So let's go check out a
hugging phase interface, which uses an image component. So this space is called
illusion diffusion. And what it does is that based
on one of these patterns, so let's say we
pick this pattern, I can create an optical illusion based on the prompt
that we enter here. So let's use the
same prompt here. Let's do a medieval village. And I click Run. So now, it created an image of a medieval village following this optical illusion pattern. So if I zoom out from here, hopefully now it's
easier for you to see the pattern in
the created image. All right, back to gradio. Another feature that we can
use in gradio are chatbots. This is how the code looks like. So in this chatbot, we're just generating
a random answer, which would be which would
be either yes or no. But in the real world scenario, we're going to use a
large language model to generate proper responses
based on the user's prompts. So for example, here, I can say hi, it says, no, let's say, how are you? So for now we're just getting
a random no or yes answer because that is the only answer that the chatbot can generate. Later on, we can add a
lodge language model to help us generate an actual chat interaction
with the user. In gradio, we can take advantage
of the blocks feature, which gives us more
flexibility and control. So traditionally, we can
use either interface or a chat interface to interact with a model
through radio library. But using blocks, we can create different blocks and put different components in
each of those blocks. It allows for more
complex interactions between different
components in gradio. Let's have a look at an
example of using blocks. So here we are creating a block. Inside this block, we
have two text boxes. We have a button, and
we can also assign a function to the click
functionality of the button. So whenever that
button is clicked, the grid function is called, and it will pass these
parameters to the function. So now we can see that all of these components
are in one block. And this is how we can add
more complexity to our blocks. So in this code, we can see
that we are creating a block, but we are also creating
two different tabs. And inside each of these tabs, we have different components. So let's see how the
interface looks like. So now we have a block, and there is one tab here and
there's another tab here. So in this tab, we have an input image and an
output image component. But in the first step, we have an input text and
an output text. Down below, we also
have an accordion menu. We can close it and open it. And inside here, we can add
more components as we need. So that was a brief overview
of what radio can offer us. Now let's have a look at some real award examples
in Hugging Face.
8. L2V3- Building the FAQ Chatbot Initial Steps: This video, we'll get started on building a customer
support chatbot. We'll go over the files needed to run the
space on inning face, and we will also
review the Python code we need to have this
chatbot work for us. This is how our customer service assistant chatbot looks like. To have this space on
our own workspace, we can either click
on the three dot here and click on
Clone repository, or we can start a new space. So for that, we need to
go to our workspace. And from here, I can click on my profile picture,
go to New space. I can select the
name of the space, choose the license that I want. We also need to pick an SDK, a software development kit. So in our case, we
want to use radio. We have an option to
select the hardware for our space and decide whether we want it to
be public or private. Once we're done, we can
click on Create space. I already have this space, so I don't need to create
it. Let's get back to it. Now let's check out this
chat bot and see how it works. Let's say hello. Yeah, sure. Hello there. I'm the chatbot from the
Imaginary Mechanics Shop. I'm here to answer
any questions you may have about our services.
How can I help you? So let me ask it. Tell me
about the history of the shop. And the chatbot provides
some information about when the workshop was founded and how many years the mechanics
have been working there. Let's ask about the
operating hours. And it will respond properly with the operating
hours of the shop. What services do you provide? So it's tell us about different services they
provide like changes, brake repairs, tire rotation, et cetera. All right. Now, let's go into
the files and see what we need to have in order to make this chat bood work. So the first file
that we want to see is the Git attributes. This file is configuring Git
large file storage or LFS, which is an extension to
Git that enables you to efficiently manage large
files and binary assets. This configuration
can be used for a variety of file
types and paths, specifically targeting
binary files and large datasets commonly
used in machine learning, data science, and
software development. This helps to keep the
size of the Git repository manageable and improves
the performance for cloning and
fetching changes. The next file is the
imaginary mechanic shop CSV. So here we can see
different questions and answers about our
imaginary mechanic shop. This is the file that the Large language model
will use as a reference, and it will be able to
respond to any question that can be answered based on the information
provided in this file. We also have a
read me file which provides different information
about the application, the software development kit
versions, and the author. The requirements that TXT file is commonly used
in Python projects to specify a list of
dependencies that need to be installed for the
project to run properly. Each line in the file
specifies a package and optionally a version or a range of acceptable versions
for that package. And finally, there
is the app Pi file, which has all the
code we need to run the application on
the hugging phase space. So let's dig deeper into
the Python code itself. In the beginning, we are
importing different libraries. We're importing radio
for the user interface. We're importing Open AI for
the large language model, which powers our chatbot, and we are also importing OS, CSV and JSON for file handling. We're also setting the
API in a encrypted way. This has to do with the security because we don't want to have our API key be visible
inside the code. I will explain this in the
next video when I talk about best practices in developing
LLM powered applications. So first of all, we want to define the CSV file input path. So the Large language model knows where to access this file. Then we initialize an empty
list to store the data. Then we open the CSV
file for reading. We create a CSV reader object and we iterate through the CSV data and
append it to the list. Next, we convert the list of dictionaries to a JSON string. Our respond function is the function that
takes the message from the user and generates a proper response based
on the input text. So we set up our JSON file. We provide a guideline for the chatbod in order to tell it how to behave and
how to response question. This also is another
part that I'm going to dig deeper into
in the next video, as it has to do with prompt
engineering and making sure that our chatbod
produces proper responses. Then in order to
produce a response, we call openai dot completion
dot create and inside here, we can decide which
engine to use. So here we are using
take Deven G 03. We identify what our prompt is, and we can set different
settings for the model. For example, I'm setting
the max token to be 300 and the
temperature to be 0.1. Next, we extract and
print the generated text. And at the end, we are
creating a radio block. So in that block, we
have the chatbot. We have a textbox
for user's message, and we have a clear bottom. Whenever the user clicks submit, we call the respond
method by passing the message and the chatbot
history up to that point. And in order to
launch all of that, we just write demo dot launch. So in conclusion, in this video, we got started on building
a customer support chatbot. We went over the files
needed to run the space on HigingFace and we reviewed the Python code we need for
building the application.
9. L2V4- Completing and Deploying the FAQ Chatbot: In this video, we're
going to show you how to deploy a chatbot to
a scalable endpoint. We will also discuss how to
apply ethical considerations and other production
best practices to your chatbot development. In order to do so, we need to go to the Settings
app in our space. We can see that we
have options for different processor
units and storage. There's also options for restarting or
rebooting the space. By changing the
space visibility, we can shift between making
the space private or public. Let's talk more about
the API key and other sensitive
information that needs to be stored in our
hugging face platform. So if your app requires environment variables,
for example, secret keys or tokens, do not hardcode them
inside your application. Instead, you can go to
the setting page of your space and add a
new variable or secret. Use variables if
you need to store non sensitive
configuration values and secrets for
storing access tokens, API keys, or any other
sensitive value or credentials. Under the Settings app, we have also other options like renaming or transferring
this space. We can also enable or disable the community
contribution feature or delete the space if you
don't want to have it anymore. Another point of interest on
their settings is web hooks. So web hooks in Hugging Face allow you to set up
automated responses or any other actions
that could be triggered by specific events on the hugging face platform. You might set up a
web hook to notify your system when a new version
of a model is available on Hugging face or when a training job that you started on the
platform is complete. Webhooks are like
automatic reminders that help us follow
good practices. They act behind the scenes to check our work whenever
we make changes, ensuring everything
fits together. They can facilitate a smooth and successful collaboration
by saving us time, keeping our project consistent, and keeping the whole team
informed of new changes. Now let's look at some prompt engineering
practices to help our chatbots act properly and
provide relevant answers. Okay, so inside the
app dot Pi file and we're providing some
chatbot guidelines. Let's break it down. So this is the breakdown
of the guidelines. Let's see how these
instructions help the chatbot provide appropriate
and relevant responses. The first thing
that we want to do is to give the chatbot a role. So we can say, you are a
conversational chatbot, acting as a mechanic at the
imaginary mechanic workshop. Then we need to
give it a function. Your primary function
is to answer all questions which are the messages provided
by the user smoothly. Respond to inquiries strictly related to the content found within the
provided document, which is in the JSON
file that we created. Also to help the chatbot, we can provide an example. The user may use the word to you as a
representative of the shop. So if the user asks, Do you fix flat tires, your answer should
be something like, yes, we fix flat tires at
the imaginary mechanic shop. We should also set
limitations for the chatbot. Your responses have limitations. Do not engage in discussions or answer questions concerning
illegal activities, explicit content,
or any topic not related to the mechanic shop
or fixing cars in general. Stick solely to the
information available in the designated file and questions that can be answered
using that information. We can also provide
instructions for handling inappropriate
or unrelated prompts. You should be able to handle inappropriate or
off topic queries. If the question is
completely off topic, politely inform users
that you can only provide assistance and
answers concerning the imaginary mechanic shop. Refraining from engaging in irrelevant or
inappropriate topics. If the question
is not off topic, but you do not have an answer, please provide a short
response to the question and ask the user to call
the shop for more info. And finally, we can tell the chatbot what the tone of
the interaction should be. Mintain respect and
professionalism. Ensure interactions are
polite, constructive, and on topic, maintaining a professional and
respectful user experience. Providing these instructions
help us create a chatbot that stays on topic and
provides appropriate responses. So in conclusion, in this video, we explored how to deploy the chatbot to a
scalable endpoint. We also learned how to apply
ethical considerations and other production
best practices to your chatbot development.
10. L3V1- Key Ethical Issues in LLM Applications: In this video, we will
explore the risks involved in deploying ethical and
responsible AI systems. By the end of this video, you will learn about the key
ethical risks concerning building production ready
applications using LLMs. The topics that we cover
in this video are biases, unsafe responses, transparency
and explainability issues, and the risks of
wrong expectations. A major concern is
potential bias in the training data that gets encoded in the
model's behavior. Models trained on text
from the Internet may inadvertently amplify the harmful stereotypes
around race, gender, or other attributes. This could lead to
discriminatory outputs that misrepresent
the real world. Another issue is
the likelihood of LLMs for occasionally
generating toxic, unsafe or untruthful content, which are also referred
to as hallucinations. Without proper
control, this could have dangerous real
world consequences. There are also transparency
and explainability issues around large neural
network models. It can be unclear
why an LLM produces a specific output or recommendation from
a human perspective. This black box nature makes it hard to audit the
model's output. In addition, the human like nature of conversational
models can lead to misplaced user trust or attachment to an AI system. Managing expectations
around capabilities of the system is an important task in presenting our production ready LLM
powered application. So during my PhD, when I was studying
the trust between humans and artificial agents, I found out that the
overall trust for an agent, including the level
of forgiveness of the user for that agent
when a mistake happened, is significantly dependent on the initial perception and
expectations of that agent. So in order to get off on the
right foot with your users, when introducing your
application to them, it is important to set
realistic expectations. Acknowledging these risks and probably more risk that we may not be aware of means
that as AI practitioners, we have an ethical obligation to proactively
address these type of challenges through
research, design, and testing. And it's important for new
developers to look out for potential risks and
ethical guidelines produced by AI scientists. In conclusion, some
potential risks in LLM powered apps are bias, unsafe responses, transparency
and explainability issues, and misinformed expectations. With diligence and care, we can utilize the
power of LLMs for good while controlling for
these potential risks.
11. L3V2- Strategies to Minimize AI Risks: In this video, we'll explore strategies for bias
mitigation and safety. By the end of this video, you will learn
healthy application practices for addressing bias and safety issues throughout the
development life cycle. The practices that we'll cover are curating diverse datasets, monitoring the training process, flagging inappropriate
outputs and behaviors, safety focused, prompt
engineering, and testing. Before starting training an LLM, a key step is to
carefully curate a diverse dataset to be
used to train the model. Data should come from
reputable sources and be screened for
harmful content. In order to fight bias, we can use features like diversity filters to help
remove skewed distributions. We can also use augmenting data, which is another
technique to include underrepresented
perspectives in the dataset. Next, the model can be
monitored for fairness across different
demographic groups during the training stage. We can use bias
mitigation algorithms, which can adjust
model parameters to reduce in equitable
performance. For deployed models, techniques like
likelihood ratcheting and content flagging
filters can detect and reduce the generation of
biased or toxic outputs. It's important to
know that monitoring is not only for the
training stage. Ongoing monitoring helps with identifying the emerging issues that need to be addressed. In addition, safety focused
prompt engineering teaches the model acceptable
conduct norms and guides it toward a benevolent
behavior when uncertain. Developers can refer to developer guidance documents to help them establish
these healthy practices. And last but not least, testing is also a
very important stage for mitigating bias. It should be done throughout the development life
cycle to analyze the model outputs for bias and safety issues before we release the application
for production. In addition, audits by external researchers can add
another layer of oversight, which is crucial for high
stakes applications. In conclusion, no
approach is perfect, but combining best practices like curating diverse datasets, monitoring the training process, flagging inappropriate
outputs and behaviors, safety focused
prompt engineering, and testing help us create
the checks and balances needed for ethical and
representative LLM applications.
12. L3V3- Promoting Transparency in AI Systems: This video will explore
how transparency and explainability benefit
LLM applications. By the end of this video, you will learn different
strategies for improving the transparency and explainability
of your applications. We will be covering visibility
into training data, local explanation techniques,
confidence scores, user testing, and
human oversight. Large language models can produce impressively
human like outputs, but the inner workings of
the neural networks are complex and a lot
like a black box. One approach to increase transparency is to
provide visibility into what training data and parameters were used
to train the model. So sharing model cards
is a good practice for providing information
about the development process. Another way to increase
transparency is through methods known as
local explanation techniques. They can help users understand which part of the
input, for example, users prompt played
a significant role in arriving at the
provided output. By highlighting or pointing
at the specific sections of the input that had a major influence on
the model's output, users can get a
clear understanding of why the model responded
in a certain way. We can also flight on certain responses to
provide more transparency. We can use confidence scores to help indicate
when the model is likely guessing versus when it is highly certain
about an output. And there is user testing, which helps identify cases where the model's logic is vague and fails to meet
the expectations. Logging these
instances can serve as a guide for future
model improvements. Ultimately, we shouldn't forget that human oversight is still required to verify
model rationale and overrt incorrect decisions. Complete autonomy
should not be given to LLMs without guardrails. Okay, we saw that
these techniques can help us build more
transparency around the model. We should note that any
explanation should be tailored based on the audience's
technical background. For developers,
detailed technical explanations could
be preferable. But for end users, usually simplified
interpretations of the model's intent are enough. In conclusion, transparency and explainability help
building a lasting trust, ensuring our app
works as intended. Features such as visibility
into training data, local explanation techniques,
confidence scores, user testing, and human
oversight help with building more transparent and explainable LLM applications.
13. L3V4- Techniques for Sustaining User Trust: This video we'll explore some of the challenges
in building trustworthy interactions between human users and
LLM applications. We will go over
important tips for maintaining user trust and
upholding communication norms. As we discussed before, expectation setting is crucial. We must clearly convey the model's capabilities
and limitations so users understand
when to trust the outputs and when
to seek human insight. We should make sure that
we don't over promise. The user interface itself is also influential
in building trust. Human like design
elements, for example, using a human like avatar
or voice can mislead users into thinking
the system is more intelligent
than it really is. Minimalist user interfaces
help users focus on the task. Establishing a
consistent character and voice for the
model helps with aligning user expectations and avoiding disorienting
personality shifts. We can benefit from
user testing to identify these problematic
inconsistencies. In addition, sticking to expected communication
norms avoids any confusion. To support a
productive dialogue, the system should follow
conventions like turn taking, clarifying ambiguous requests
or admitting its ignorance. Transparency is also
very important. Transparently, disclosing
the LLM's role and providing information about
its training data gives users appropriate context. Also, explaining its limitations helps with building
credible trust. And allowing user
feedback helps with identifying failures of trust
or communication issues. We should continuously monitor
interactions and apply user feedback in the
future iterations to improve the
relationship over time. In conclusion, with thoughtful
design and transparency, LLM developers can craft
systems worthy of users trust. These considerations
reduce the risk of unintended consequences
and biases in LLM based applications. Prioritizing these
factors is crucial for creating trustworthy and universally applicable
AI solutions. I hope you enjoyed
this short course with me and found the
content valuable. Hopefully I will see you soon in another exciting generative
AI course. Keep exploring.