Hugging Face Course for Beginners | Amit Diwan | Skillshare

Hugging Face Course for Beginners

Amit Diwan, Corporate Trainer

Lessons in This Class

  1. About Course (0:45)
  2. Hugging Face - Introduction and Features (3:22)
  3. Hugging Face - Use Cases (7:01)
  4. Transformers Library of Hugging Face (4:21)
  5. Datasets Library of Hugging Face (5:08)
  6. Tokenizers Library of Hugging Face (4:45)
  7. Hugging Face Access Token (API Key) & How to Create (5:21)
  8. Download a dataset on Hugging Face (3:11)
  9. Download a model from Hugging Face (2:38)
  10. Sentiment Analysis using Hugging Face (6:31)
  11. Text Classification using Hugging Face (5:43)
  12. Text Summarizations using Hugging Face (3:48)
  13. Text to Text (Translate) using Hugging Face (3:41)
  14. Question Answering using Hugging Face (2:36)
  15. Text to Image using Hugging Face (4:10)
  16. Text to Video using Hugging Face (6:00)

Students: 10

About This Class

Welcome to the Hugging Face course. Hugging Face is a company and open-source community that focuses on natural language processing (NLP) and artificial intelligence (AI). It is best known for its Transformers library, which provides tools and pre-trained models for a wide range of NLP tasks, such as text classification, sentiment analysis, machine translation, and more.

Hugging Face – Features

Here are some of the features of Hugging Face:

  • Transformers Library: A comprehensive library that includes thousands of pre-trained models like BERT, GPT, T5, and others, which can be fine-tuned for specific tasks.
  • Model Hub: A platform where users can share and download pre-trained models, datasets, and other resources.
  • Datasets Library: Provides easy access to a wide variety of datasets for NLP tasks.
  • Spaces: A platform for hosting and sharing machine learning demos and applications.
  • Inference API: Allows users to easily deploy and use models in production environments.
  • Community and Collaboration: Hugging Face fosters a strong community of researchers, developers, and enthusiasts who contribute to the ecosystem.

Course Lessons

✔️ Hugging Face Overview

  • Hugging Face - Introduction and Features
  • Hugging Face - Use Cases

✔️ Hugging Face Libraries

  • Transformers Library of Hugging Face
  • Datasets Library of Hugging Face
  • Tokenizers Library of Hugging Face

✔️ Hugging Face Access Token (API Key)

  • Hugging Face Access Token (API Key) & How to Create

✔️ Working with Datasets and Models

  • Download a dataset on Hugging Face
  • Download a model from Hugging Face

✔️ Use Pre-Trained Models with Hugging Face

  • Sentiment Analysis using Hugging Face
  • Text Classification using Hugging Face
  • Text Summarizations using Hugging Face
  • Text to Text (Translate) using Hugging Face
  • Question Answering using Hugging Face
  • Text to Image using Hugging Face
  • Text to Video Synthesis using Hugging Face

Who this course is for:

  • Those who want to begin their AI journey
  • Beginner AI Enthusiasts
  • Learn to use the pre-trained models on Hugging Face
  • Those who want to generate images from a text prompt
  • Those who want to generate videos from a text prompt
  • Learn to translate text using pre-trained models
  • Analyse sentiments with a pre-trained model

What you'll learn

  • Learn Hugging Face from scratch
  • Understand the Hugging Face use cases
  • Understand pre-trained models on Hugging Face
  • Get to know the Datasets on Hugging Face
  • Learn to work with the Transformers library
  • Learn to work with the Datasets library
  • Learn to work with the Tokenizers library
  • Text Summarization with Hugging Face
  • Translate Text with Hugging Face
  • Text-to-Image with Hugging Face
  • Text-to-Video with Hugging Face
  • Question-Answering with Hugging Face

Meet Your Teacher

Amit Diwan

Corporate Trainer

Teacher

Hello, I'm Amit,

I'm the founder of an edtech company and a trainer based in India. I have over 10 years of experience in creating courses for students, engineers, and professionals in varied technologies, including Python, AI, Power BI, Tableau, Java, SQL, MongoDB, etc.

We also work B2B and sell our video and text courses on today's trending technologies to top edtechs. Over 50k learners have enrolled in our courses across all of these edtechs, including Skillshare. I turned down a job offer from one of the leading product-based companies, and three government jobs, to follow my entrepreneurial dream.

I believe in keeping things simple, and the same is reflected in my courses. I love making concepts easier for my audience.

Level: Beginner

Transcripts

1. About Course: In this video course, learn Hugging Face and its concepts. Hugging Face is a company and open-source community that focuses on natural language processing and artificial intelligence. It is best known for its Transformers library, which provides tools and pre-trained models for a wide range of NLP tasks, such as text classification, sentiment analysis, machine translation, and more. In this course, we have covered the following lessons with live running examples. Let us start with the first lesson.

2. Hugging Face - Introduction and Features: In this lesson, we will learn what Hugging Face is. With that, we will also understand its features. Let us start. Hugging Face is a widely known company and open-source community that focuses on NLP, that is, natural language processing. It also focuses on artificial intelligence. Hugging Face is best known for its Transformers library, which provides tools and pre-trained models so that a wide range of NLP tasks, such as sentiment analysis, machine translation, and text summarization, can be performed. The most widely used Hugging Face libraries are transformers, datasets, and tokenizers. Let us see the features. It includes lots of libraries. One of the key libraries is transformers, which includes pre-trained models such as BERT. Hugging Face also includes the Model Hub, a platform where users can share and download pre-trained models. With that, users can also download datasets and other resources. Hugging Face also includes a library for a variety of datasets. It is called the Datasets library and is used for NLP tasks. Hugging Face also has a platform for hosting and sharing machine learning demos and applications, which is called Spaces. With that, using Hugging Face, you can easily deploy and use models in production environments. Hugging Face has a strong community and collaboration, that is, a community of developers and AI lovers who contribute to the ecosystem.
Let us see some of the popular models on Hugging Face. The widely used BERT is used for understanding the context of words in a sentence. Its full form is Bidirectional Encoder Representations from Transformers. It is a powerful open-source language model developed by Google for NLP. It excels at understanding the context of words and sentences by analyzing the relationships between them in a bidirectional manner, which allows computers to better understand the meaning of ambiguous language. The popular models also include GPT, that is, Generative Pre-trained Transformer, also T5, the Text-to-Text Transfer Transformer, and with that RoBERTa, that is, a Robustly Optimized BERT Approach. RoBERTa is a transformer-based language model that employs self-attention to analyze input sequences. RoBERTa applies dynamic masking, where the masking pattern is changed during training instead of being fixed once. It offers enhanced performance on various NLP tasks. So you can also relate BERT with RoBERTa. Consider that the main goal of the RoBERTa model is to improve the performance of the BERT model by addressing its limitations. These were the popular models of Hugging Face. In this lesson, we saw what Hugging Face is, its introduction, its features, and some popular models. Thank you for watching the video.

3. Hugging Face - Use Cases: In this lesson, we will understand the use cases of Hugging Face. Hugging Face supports a wide range of use cases across NLP, computer vision, and even multimodal applications. Let us see the use cases. Hugging Face is widely used. We have discussed some key use cases here, beginning with conversational AI, which you already know, that is, chatbots. Build intelligent chatbots using models like GPT, BlenderBot, and others. These chatbots can be used for customer support, like virtual assistants. With that, you can also create interactive dialogue systems. These can be used as educational assistants as well as therapy bots.
Next comes sentiment analysis. As the name suggests, you can easily analyze customer feedback, social media posts, or survey responses to determine whether the sentiment is positive, negative, or neutral. With that, easily generate text: generate articles, blogs, even poems using models like GPT. Code can also be easily generated: generate code snippets in any programming language. With that, content can be generated, including product descriptions, reviews, marketing plans, and others. Next comes text summarization. If you want to summarize your text, let's say you want to summarize news, you can easily do it. With that, let's say you have some PDF documents and you only want a summary. Those lengthy documents can be easily summarized into important points. With that, you can also jot down meeting notes. Then comes named entity recognition: easily extract names, skills, and experience from resumes. It is also useful in healthcare to identify diagnoses, the names of patients, medical terms, and others. With that, you can also extract the names of companies and how they are performing from their financial reports. Therefore, it is also used in the finance domain. Machine translation: as the name suggests, you can translate your website, app, and even documents from one language to another, let's say from English to Spanish. It can also be used for translation in low-resource languages. The question answering use case is mostly useful for customer support. With that, easily answer the questions asked by students based on a specific textbook or notes. FAQs can be easily answered, and when I said customer support, that means the support tickets we have seen on websites, where users can easily ask questions. With that, you can easily retrieve answers from large documents or databases. Also use it for speech recognition and synthesis.
You can also convert speech to text and build voice-controlled applications using speech-to-text and text-to-speech models. You can also provide real-time captioning and generate descriptions for images. Let's say you have scanned documents or images and you want the text from them; you can easily achieve this. Also, if you want to read or scan images and answer questions about them, that can also be achieved. That means visual question answering. Then come recommendation systems. You must have seen them on Netflix or Amazon Prime. Easily recommend movies or web series based on what people actually like in their account. With that, enhance the search results by understanding the intent of users and their context. Also detect fraud easily. With that, regarding emails, easily detect and filter spam emails. Mental health can also be monitored using a model: easily analyze text or speech so that emotions such as stress, anxiety, or even depression can be detected. With that, understand the emotions of customers during support calls or even chats. Text to speech and speech to text can also be achieved, and real-time translation can be easily worked upon. Multimodal applications: with multimodal applications, you can easily analyze not only text but also video and audio. Augmented reality applications can also be built. Easily generate synthetic text data for training machine learning models. You can also paraphrase text, identify relationships between entities in text, grade the essays or assignments of students easily, and build tools for grammar correction, vocabulary, rephrasing content, and others. Use it in healthcare and life sciences: from the medical records of a patient, you can easily extract insights. From legal documents, easily extract the key clauses, obligations, or any possible risks. With that, you can also create AI-driven narratives for games. Analyze social media easily so that you can identify the trending topics or even hashtags.
Also analyze the impact of posts made by influencers. Multilingual applications can also be created so that you can enable search across multiple languages. Hate speech and harmful content are something that needs to be worked on. With this, you can easily detect and moderate them. Predict stock market trends by analyzing news articles, social media sentiment, Twitter posts, and others. Predict events like a product launch. Personalize email content for marketing campaigns. Customize website or app content based on user preferences and behavior. Fine-tune pre-trained models for your specific task. Also, you can compare the performance of different models on custom datasets. So, guys, we saw some of the great use cases of Hugging Face. In the upcoming lessons, we will implement some of them.

4. Transformers Library of Hugging Face: In this lesson, we will understand the transformers library of Hugging Face. We will also learn how to install it. Let's see. The transformers library is the core library for pre-trained models and pipelines. It is an open-source Python library. As I said before, Hugging Face developed the transformers library, and it is modular and extensible. It includes thousands of pre-trained models for a wide range of NLP tasks such as translation, text summarization, text classification, and others. So in this lesson, we will understand what the transformers library is, why to use it, its use cases, as well as how to install it. Let us start. We already saw what the transformers library is. Now we will see why to use it. The transformers library is widely used because it makes complex NLP models quite simple to use. It provides you access to cutting-edge models. With that, it is backed by a large and active community. It supports customization and fine-tuning. With that, you can integrate the transformers library with other tools.
Here are some use cases of the transformers library. Classify text into categories, that is, text classification, as in the case of spam detection in emails. Also identify entities like names, dates, and locations in text, which is called named entity recognition. Translate text between different languages, for example from English to German. With that, generate text using models like GPT. Also implement question answering, that is, answering on the basis of a given context. Let us see how to install the transformers library. Here are the different ways. Use pip to install the transformers library. pip is a package manager to download, install, and manage Python packages and libraries. With that, you can also use Google Colab. Here you can find some difference in the syntax: there is an exclamation sign if you're installing it on Google Colab. With that, you can also install the transformers library directly from the Hugging Face GitHub repository. So let us see how to install it. We will use Google Colab for it. We will add the following command. Let's see. Here is our browser. I'll type Google Colab and press Enter. Here is the link provided by Google itself, colab.research.google.com. Here you can see I am already logged into my Gmail account, so it will open directly. I clicked on it, so it is asking me to create a new notebook here. These are my already created notebooks. I'll click New Notebook. It is a free web application. Now we will use the same command here to install it. I'll show you again. Here is the command. Okay, let us type the same command: pip install transformers. After that, we just need to click this. "Run" is written here. Can you see "Run cell"? Okay, so in this way, we can install the transformers library using Google Colab. You can add the name of your Python notebook here. So this created a Python notebook. If you know Anaconda, you can easily guess what a Python notebook is. Save it from here and rename it later.
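Once the install cell has finished, one quick way to confirm that it worked is to ask Python which version it can see. The sketch below uses only the standard library; the helper name `installed_version` is made up for this illustration and is not part of transformers.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report whether the library installed in this lesson is visible to Python.
print("transformers:", installed_version("transformers") or "not installed")
```

The same helper can be pointed at any other package name installed with pip.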
So here I've just used this syntax to install the transformers library. In this lesson, we saw what the transformers library is. We also saw why it is so popular, and with that we saw some use cases and how to install it.

5. Datasets Library of Hugging Face: In this lesson, we will understand the datasets library of Hugging Face. With that, we will also see how to install it. Let us start. The datasets library provides easy access to a wide variety of datasets for NLP and other machine learning tasks. It is developed by Hugging Face and is a Python library. It makes it easier for developers and researchers to work with data for training and evaluating models. So in this lesson, we will see what the datasets library is, why to use it, its use cases, and with that how to install it. Let us start. We already covered what the datasets library is, so let us start with why to use it. One of the reasons is efficiency: lazy loading and streaming make it easy to work with large datasets. Datasets can be huge, and we always need a library or a technology to ease the work of accessing and working on them. So this library really helps. It has a unified API for processing datasets. You can also use the datasets library with the transformers library and other ML frameworks, so integration and interoperability are possible. Thousands of datasets are provided by Hugging Face. It supports custom datasets and preprocessing pipelines. So before installing the datasets library, let me show you its website. Here is the link: huggingface.co, the official website of Hugging Face, slash datasets. You can see how many datasets are provided, over 350K, and here they are. If you click on any one of them, you can get all the details. In this tutorial, we will also show you how to download and access a dataset easily using Hugging Face.
Let us now see the use cases of the datasets library. Easily load and preprocess datasets for tasks like spam detection, which comes under text classification. With that, you can also work on sentiment analysis and question answering. Using this, you can easily build question answering systems. Some datasets are also provided for translation, and with that, named entity recognition can also be served by datasets already provided by Hugging Face. Load and preprocess your custom datasets using the datasets library. Now let us see how to install the datasets library. You can use pip, Python's package manager, to download, install, and manage Python packages and libraries. Just use the command pip install datasets. With that, you can also use Google Colab easily. But there is a difference between the two syntaxes: you have an exclamation mark for Google Colab. We will see it later. With that, you can directly install it from the Hugging Face GitHub repository using the provided syntax. The git+https://github.com/huggingface/datasets part tells pip to install the package from the Hugging Face datasets repository on GitHub. Now let us see how to install the datasets library. We already saw Google Colab, so I'll just use the second syntax to install the datasets library on Google Colab. This is our Google Colab. We already saw how to install the transformers library. We could install the datasets library here itself, but let me create a new notebook: go to File, click New Notebook. Now a new Python notebook has opened. Let us type the command to install datasets, pip install datasets, and just run the cell from here. I've shown this before as well. Let's wait. The tick mark is visible. That means we successfully installed it. You can also save it from here, as I told you before. Save, and let us add a name for our Python notebook. So in this way, guys, we can easily install the datasets library.
In the upcoming lessons, we will also see how to work with the datasets. Guys, we saw what the datasets library is. We understood the concept and its use cases, and we also saw how to install the datasets library.

6. Tokenizers Library of Hugging Face: In this lesson, we will understand the tokenizers library of Hugging Face. With that, we will also see how to install it. The tokenizers library is a fast and efficient library for tokenizing text, which is often used alongside the transformers library. We already saw the transformers library in the previous lessons. So the tokenizers library is a fast, efficient, and flexible library designed for tokenizing text data, which is a crucial step in natural language processing. Tokenization involves splitting text into smaller units such as words, subwords, or characters; these are then converted into numerical representations that ML models can process. In this lesson, we will understand what the tokenizers library is, why to use it, its use cases, as well as how to install it. So let us start. We already saw what the tokenizers library is, so now we will see why to use it. It is quick, that is, optimized for fast tokenization even on large datasets. It also supports custom tokenizers and is flexible enough to support multiple tokenization algorithms. Integration is possible, which means you can use it with other Hugging Face libraries like transformers. It has an easy API for tokenizing, decoding, and managing vocabularies. With that, you can easily access pre-trained tokenizers. Now let us see the use cases. Easily tokenize textual data for classifying text, such as spam detection in emails. With that, you can also analyze the spam. Easily perform sentiment analysis, and align the tokens with entity labels. It is also used for machine translation. Some of its other use cases include text generation and even question answering.
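The two steps described above (split text into tokens, then map the tokens to numbers) can be illustrated with a toy example in plain Python. This is only a sketch of the idea: it splits on whitespace and builds its own tiny vocabulary, whereas the tokenizers library relies on trained subword algorithms. The function names and the `[UNK]` entry are chosen for this illustration.

```python
def build_vocab(sentences):
    """Assign an integer id to every distinct lowercase word, in order of first appearance."""
    vocab = {"[UNK]": 0}  # id 0 is reserved for words not seen while building the vocabulary
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Split text on whitespace and convert each token to its id."""
    return [vocab.get(word, vocab["[UNK]"]) for word in text.lower().split()]


def decode(ids, vocab):
    """Map ids back to tokens and join them with spaces."""
    id_to_token = {idx: token for token, idx in vocab.items()}
    return " ".join(id_to_token[i] for i in ids)


vocab = build_vocab(["I love this product", "the service is terrible"])
ids = encode("I love the service", vocab)
print(ids)                 # [1, 2, 5, 6]
print(decode(ids, vocab))  # i love the service
```

A word that never appeared when the vocabulary was built maps to id 0, which is one reason real tokenizers prefer subword units over whole words.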
With that, train and use tokenizers for domain-specific datasets. Now, let us see how to install the tokenizers library. We can use pip, Python's package manager, to download, install, and manage Python packages. Use the syntax pip install tokenizers to install it. With that, we can also use the tokenizers library on Google Colab. We already saw how to install a library on Google Colab. Similarly, we can use exclamation mark, pip install tokenizers to install it. There is also a third way: you can tell pip to install the package directly from the Hugging Face tokenizers repository on GitHub. Type pip install git+ followed by the GitHub path to install it. Now let us see how to install the tokenizers library on Google Colab. We will open Google Colab again. Here is our Google Colab. We already installed the transformers and datasets libraries. We could install the tokenizers library here itself, but let me create a new Python notebook: go to File, click New Notebook. Now, let us type the command, exclamation mark, pip install tokenizers, and click on the cell. Click here, Run cell. Now the tokenizers library will get installed. You can also save this. As I told you before, it will create a Python notebook. So here, I'll type Amit underscore. You can add any name. And this is our Python notebook. We will use all these libraries later on, when we work on the use cases of Hugging Face. So, guys, we saw what the tokenizers library is. We also saw its purpose as well as its use cases. With that, we also installed the tokenizers library on Google Colab.

7. Hugging Face Access Token (API Key) & How to Create: In this lesson, we will learn what a Hugging Face access token is. With that, we will also learn how to create one. Let us start. Consider an access token as a secure string of characters. It is mainly used to access Hugging Face services and resources.
The Hugging Face API key and the Hugging Face access token are the same thing. So in this lesson, we will see what an access token, that is, an API key, is. With that, we will learn when we need a Hugging Face access token. Also, we will understand when the Hugging Face access token isn't required. In the end, we will learn how to create an API key. Let us start. We covered what an API key, that is, an access token, is in Hugging Face. Now let us see when we need a Hugging Face access token. Here it is. When you're using a private or gated model or the Inference API, you need a Hugging Face access token. You must have heard about Meta Llama. It is a gated model. To access it, you need to authenticate. That means you need to create an API key. You need a token. With that, if you're using the Hugging Face Inference API, then you need an access token to make API calls. Also, if you're uploading models, datasets, or even Spaces to the Hugging Face Hub, you need an access token. Now let us see when you do not need a Hugging Face access token. Obviously, if you're accessing public models which are publicly available to download and use, such as GPT-2, you don't need an access token. Also, if you're using such public models via the transformers library of Hugging Face, you don't need an access token. These are publicly available and can be easily downloaded without any authentication, without any API key. Also, a lot of open-source models are available. To access these models, you don't need an API key or an access token because they are freely available. In the upcoming lessons, we'll be working on these public and open-source models only, so there is no need to create a Hugging Face access token. Now, let us see how to create a Hugging Face access token. We will go to the Hugging Face website and create an access token. So let us start. Open the official website, huggingface.co/join, and press Enter. So here it is, you need to join.
That means you need to create an account on Hugging Face. Here you can use your email address. So let me create my account. Here I have added my email ID. Now enter the password. Here it is. Now click Next. Complete your profile here: add a username and add your name. You can also add your Twitter username and your LinkedIn profile. These are optional. You can also upload your avatar. Also, you can add your GitHub username as well as your website. As you can see, these are optional. Click "I have read", and after that, click Create Account. We have created an account. You need to check your email for a confirmation link. Now your account is verified; your email address has been verified. Click on your profile. Go below. It says Access Tokens. Here it is. Click on it. Now, you need to create a new token by clicking here. Remember, do not share your access tokens with anyone. Create new token. Add the token name. Let's say I'll type Demo key. Okay. Now go below. Click Create Token. The key was created successfully. You can copy it and save it. Here it says, save it somewhere safe. You will not be able to see it again after you close this modal. Click Done. Now all your keys are visible. Here it is, we created a single key just now, and when you click here, you can edit it. You can edit the permissions and also delete it. Okay, we saw what access tokens, or API keys, are in Hugging Face. With that, we also learned how to create one.

8. Download a dataset on Hugging Face: In this lesson, we will learn how to download a dataset from Hugging Face. For that, we will use the datasets library. Let us see. A dataset refers to a collection of structured data, which can be used for training, evaluating, or testing machine learning models. Hugging Face has a lot of datasets on its platform, which can be used for various use cases like NLP. We will use the datasets library to download a dataset from Hugging Face. Let us see.
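A minimal sketch of the download this lesson walks through, assuming the datasets library is installed and the notebook has internet access. `load_dataset("imdb")` is the call named in the lesson; the helper names `load_imdb` and `split_sizes` are hypothetical and only there to keep the example tidy.

```python
def load_imdb():
    """Download the IMDB dataset from the Hugging Face Hub (needs network access)."""
    from datasets import load_dataset  # imported here so the helper below works without it

    return load_dataset("imdb")


def split_sizes(dataset_dict):
    """Return the number of rows in each split, for example train and test."""
    return {name: len(split) for name, split in dataset_dict.items()}


# In a Colab cell, after `!pip install datasets`:
# dataset = load_imdb()
# print(dataset)               # overview of the splits and their features
# print(split_sizes(dataset))  # row counts per split
```

The calls are left commented out so the block can be pasted anywhere without triggering a download; uncomment them in a notebook cell to fetch the data.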
First, let us see the datasets. Go to the Hugging Face website, slash datasets. These are the datasets provided by Hugging Face. You can see a lot of them. Let us see how we can download one. We will go to the same platform, Google Colab, which we have used before in this tutorial. Here it is. This is the notebook we already created. In this, we first installed the datasets library. I already told you how to install it on Google Colab using the pip command. After that, we loaded a dataset using the load_dataset function. This function can download datasets from the Hugging Face Hub or load them from local files. We are downloading a dataset from the Hugging Face Hub right now. Here it is. Here we are loading the IMDB dataset. After that, I'm printing the dataset using the print function. Here we are importing the load_dataset function. This provides access to various public datasets, like IMDB in this case. Here we are loading the IMDB dataset. The IMDB dataset contains movie reviews labeled as positive or negative for sentiment classification. When you run it, it will automatically download and process the dataset. Here we are printing the dataset. This will show the dataset's splits, such as train and test; that is, it will display an overview of the dataset, including the number of samples in each split. Let us see. After running, here it is: it is showing us the structure of the IMDB dataset as a DatasetDict, which organizes the dataset into different splits. The train split contains 25K rows with the features text, for the movie reviews, and label, for the sentiment, positive or negative. The test split has 25K rows for testing purposes with the same features. The unsupervised split contains 50K rows, but this split typically doesn't have labels for sentiment analysis. It's often used for tasks like pre-training or semi-supervised learning. In this way, guys, we can download a dataset.

9.
Download a model from Hugging Face: In this lesson, we will learn how we can download a model from Hugging Face. Let us see. To download, we will use the transformers library. With that, we can also download directly from the Hugging Face Hub. Let us see a step-by-step guide to download and use models from Hugging Face. We will use the transformers library, which we already discussed. Let us see. Here is our code. We already created a notebook file: File, Open notebook. Here we already created Amit_download_model. In this, we first installed the transformers library. We already discussed that Hugging Face developed this library. We used pip to install it on Google Colab. After that, we downloaded a model using the transformers library. We have used the from_pretrained method for this. This method downloads the model weights, configuration, and tokenizer from the Hugging Face Hub. We are downloading a pre-trained BERT model. Here it is. We ran this and we got the shape. This shape is commonly seen in models like BERT, where each token in a sequence is represented by a 768-dimensional vector. When we use the bert-base-uncased model and pass the input hello Hugging Face, the shape of the last hidden state output gives the tensor dimensions. For this example, the shape you would typically see is the following. Here, one is visible. It is the batch size, since there is one input sentence. Seven is the sequence length. This corresponds to the tokenized version of hello Hugging Face, including special tokens. 768 is the hidden size. Each token is represented as a 768-dimensional vector, standard for BERT's base architecture. In this way, guys, we can easily download a model using the transformers library with Google Colab.

10.
Sentiment Analysis using Hugging Face: In this lesson, we will learn how to implement sentiment analysis with Hugging Face. We will understand what sentiment analysis is, along with its types. After that, we will run a coding example on Google Colab. Let us see. So we already discussed the transformers library provided by Hugging Face. It is a powerful tool for tasks like sentiment analysis. Now, what is sentiment analysis? As the name suggests, it involves determining the sentiment expressed in a piece of text, like positive, negative, or neutral. So let's say, I love cricket. This is a positive sentence. Okay, I don't like something, I hate something, so that is a negative sentiment. Similarly, when I explain the types of sentiment analysis, things will be more clear. The first one is polarity detection, that is, positive, negative, or neutral. I love this product is positive, obviously. The service is terrible is not good, so it is negative. And when things are not clear, it will be neutral, like the package arrived on time. Next comes emotion detection. Let's say you said, this is not good, this is pathetic, this is so frustrating. That is anger. And joy is expressed by a sentence like, I'm thrilled about the results. So emotion detection includes happiness, frustration, and other emotions. Then comes aspect-based sentiment analysis, like sentiment towards a specific product or service. Like, the food was great, but the service was slow. In this case, the food has a positive sentiment, obviously. But since the service was not good, that is a negative sentiment. Then comes intent analysis, like the intent to purchase something or to complain. Let's say you said, where can I buy this product? So that is a purchase intent. So these were the types of sentiment analysis. Now, let us see the coding example. In this, we will use a public model. So we won't be creating an access token because for public models, as I already told you, we don't need it. We will run the code on Google Colab.
For efficiency, we can also change the runtime on Google Colab, so I'll also show you that with the example. Let us start. Here is our Google Colab. Okay, let me open the code. File, open notebook. I already created the project. Here it is, Sentiment Analysis. So for efficiency, we can change the runtime type. Click the Runtime menu, and here click Change runtime type. Okay. We can see we already selected the T4 GPU, not a problem. If your project is quite complex or you have a large-scale project, you can select the v2-8 TPU also. I'll keep the same, okay? So initially here, what we did: first, we installed the required libraries, that is, transformers and torch. Okay, we used pip. We already discussed how to install it in the previous lessons. After that, we ran it using this. In this line, we imported the necessary modules. Here we loaded the sentiment analysis pipeline. The pipeline function provides a simple way to perform various NLP tasks, including sentiment analysis. You can load a pre-trained sentiment analysis model as follows. So here, what we did: we loaded the following model. Okay. After that, we performed sentiment analysis. Since we have loaded the sentiment analysis pipeline, we use it to analyze the sentiment of a piece of text. So here, I love playing and watching cricket, and I hate when Virat Kohli misses a century. These are my texts. So obviously, you can guess this is a positive sentence, and this is a negative sentence. You can easily guess it. So this is the sentiment analysis. Here the output you can see is a list of dictionaries. Here it is. Each dictionary contains the sentiment label and the confidence score. Here it is, label and the confidence score. So here we have analyzed multiple texts at once by passing a list of strings to the sentiment analyzer. Now let us understand the output completely.
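The steps walked through here can be sketched roughly as follows. This is only a sketch: the transcript does not show which model the notebook loads, so the checkpoint name below is an assumption (a commonly used public sentiment model), and the helper names format_result, analyze, and main are made up for illustration.

```python
# Requires: pip install transformers torch

# Assumption: the lesson's model is not named in the transcript; this is a
# widely used public sentiment checkpoint on the Hugging Face Hub.
MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"

# Example texts in the spirit of the lesson.
TEXTS = [
    "I love playing and watching cricket.",
    "I hate when Virat Kohli misses a century.",
]


def format_result(text, result):
    """Turn one pipeline result dict into a readable line."""
    return f"{text!r} -> {result['label']} (score: {result['score']:.4f})"


def analyze(texts, model_name=MODEL_NAME):
    """Run the sentiment-analysis pipeline on a list of strings."""
    # Imported here so format_result works without the heavy dependency.
    from transformers import pipeline

    sentiment_analyzer = pipeline("sentiment-analysis", model=model_name)
    # Passing a list returns one {"label": ..., "score": ...} dict per text.
    return sentiment_analyzer(texts)


def main():
    """Analyze the example texts and print one line per text."""
    for text, result in zip(TEXTS, analyze(TEXTS)):
        print(format_result(text, result))
```

Calling main() in a Colab cell downloads the model on first use and prints one label and score per text.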
The score in the output of the Hugging Face sentiment analysis pipeline represents the confidence level or probability that the model assigns to the predicted sentiment label. It indicates how confident the model is that the given text corresponds to the predicted sentiment. The score is a value between 0 and 1. As you can see, a score closer to one means the model is very confident in its prediction. If the score was closer to zero, that would mean the model is less confident in its prediction. The label positive indicates that the model predicts the sentiment of the text is positive. That is the following. Negative means the opposite, that is, negative. Here it is. So here you would be wondering why the score is so high, close to one. This is because the model we are using has been fine-tuned on a large dataset and is highly accurate for sentiment analysis tasks. The input text likely contains strong, unambiguous language that makes it easy for the model to predict the sentiment with high confidence, like hate means negative and love means positive. So in this way, guys, we can work on sentiment analysis with Hugging Face.
11. Text Classification using Hugging Face: In this lesson, we will learn how we can use Hugging Face for text classification. First, we will understand what text classification is. With that, we will also see the difference between sentiment analysis and text classification. After that, we will create and run an example on Google Colab. Let us start. Text classification, as the name suggests, classifies text into categories, so it can be used for spam detection. So on your email ID, you must have seen that some emails go to spam, and some emails are not considered spam. In a similar way, you can also classify news articles or documents, like a sports article under the sports category and a tech-related article under the technology category. And with that, it also includes a use case for intent detection, like to cancel an order, to book a flight, and others.
So let us see. Till now, we have covered sentiment analysis. So here is the difference between sentiment analysis and text classification. As the name suggests, the labels for sentiment analysis are narrow, that is, specific to the sentiment. Let's say positive sentiment for a text like I love cricket. In a similar way, the labels for text classification depend on the task, like I just discussed, spam or not spam, or different topics. With that, for sentiment analysis, as we discussed before, it is mainly positive, negative, or neutral. Some use cases include classifying email as spam or not spam under text classification. Under sentiment analysis, one of the use cases can be a positive product review. Now, let us see a coding example where we will detect spam or not spam based on a text. We will use a publicly available model, that is, the following. So we won't be needing any access token from Hugging Face for this. So let us see the example and classify text as spam or not spam. Here is our Google Colab. We created these notebooks till now. Let us open our text classification notebook. Open notebook. Here it is. We already created it. Let us see the steps. First, we will install the required libraries, that is, to begin with, the Hugging Face transformers library as well as the torch library. So we have used the pip install command for this. Let's go below. After that, we will import the necessary modules. Here we have imported the pipeline function. Then we have loaded a pre-trained spam detection model, that is, the following here. It is freely available. So we did not apply for any key for it from Hugging Face. Now the next step includes performing the spam detection. First, we have set multiple texts so that we can detect whether these texts are spam or not spam. We have classified multiple texts at once by passing a list of strings. Here it is. We have mapped labels to spam and not spam here. Here it is, label mapping.
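A rough sketch of this spam-detection flow is given here. The model name is an assumption, since the transcript does not show it; it is a public sentiment checkpoint that emits negative, neutral, and positive labels, which matches the label mapping used in the lesson (negative treated as spam, the other two as not spam). The names map_label, classify, and main are hypothetical helpers.

```python
# Requires: pip install transformers torch

# Assumption: the transcript does not name the model. This public checkpoint
# is fine-tuned for sentiment and emits negative / neutral / positive labels.
MODEL_NAME = "cardiffnlp/twitter-roberta-base-sentiment-latest"

# Label mapping from the lesson: negative -> spam, the rest -> not spam.
LABEL_MAPPING = {
    "negative": "Spam",
    "neutral": "Not Spam",
    "positive": "Not Spam",
}


def map_label(raw_label):
    """Map a raw model label to 'Spam' / 'Not Spam' (case-insensitive)."""
    return LABEL_MAPPING.get(raw_label.lower(), "Unknown")


def classify(texts, model_name=MODEL_NAME):
    """Classify a list of strings; returns one label/score dict per text."""
    # Imported here so map_label can be used without the heavy dependency.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_name)
    return classifier(texts)


def main():
    """Classify example texts in the spirit of the lesson."""
    texts = [
        "Congratulations! You have won a 500 INR Amazon gift card. Click here to claim now.",
        "Hi Amit, let's have a meeting tomorrow at 12:00 PM.",
        "Your Gmail account has been compromised.",
    ]
    for text, result in zip(texts, classify(texts)):
        print(f"Text: {text}")
        print(f"Label: {map_label(result['label'])}, score: {result['score']:.4f}")
```

Because a sentiment model is only being repurposed here, the mapping is a heuristic; a checkpoint fine-tuned on spam data would normally be a better fit.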
Negative means spam, neutral means not spam, positive means not spam. Okay. To display the results, we have used the for-in loop. Here it is. What will happen? The output will include the label, whether that text is spam or not spam. With that, the score will also be visible. These are the confidence scores. Here it is. So according to our model, the first text is spam. Obviously, because it is showing, Congratulations, you have won a 500 INR Amazon gift card, click here to claim now. The second one is not spam. Obviously, Hi Amit, let's have a meeting tomorrow at 12:00 P.M. So obviously, this is not spam. The last one is also considered spam. We get a lot of such spam emails saying that your Gmail account has been compromised. Here, the confidence score is displayed. Low confidence scores indicate that the model is uncertain about its predictions. The following model is fine-tuned for sentiment analysis, but not specifically for spam detection. We are still adapting it for spam detection. Okay. That's why here it is showing not spam, but the confidence score is even less than 0.7. I told you that low confidence scores indicate that the model is uncertain about its predictions. You can set a different model here from Hugging Face. Here we are showing an example. So in this way, we can use the transformers library of Hugging Face to detect spam, that is, to perform text classification.
12. Text Summarizations using Hugging Face: In this lesson, we will understand how to perform summarization using Hugging Face. First, we will understand why we need to summarize, and then we will see a coding example on Google Colab to summarize text. Let us start. The Hugging Face transformers library, as you already know, is used for NLP tasks. That includes summarizing text as well. So why summarization? Summarization is actually used in a lot of real-world applications.
You must have seen long articles summarized into short snippets, and with that, summaries of documents or research papers. Chatbots also provide quick, concise responses. With that, you can extract key points and summaries from a document and from large datasets also. Now let us see an example. Here we will use the following model, which is publicly available on Hugging Face. So we don't need to add the access token. Okay, we will run the code on Google Colab like we saw before. So let us see the code. So here is our Google Colab. We will open our code. File, open notebook. So here we are discussing summarization. Here is our code. First, what we did: we installed the required libraries. So we have installed the transformers as well as the PyTorch library here using the pip install command. We already discussed this command before. After that, we will use AutoModelForSeq2SeqLM and AutoTokenizer for more control over the process so that we can load the model and tokenizer directly. So this is what we have done here. Here we have loaded the following pre-trained model for summarization. So here we have set the input text to summarize. So this is our text. We will summarize this. First, we have tokenized the input text using the following. Okay, so here you can see some parameters. These parameters will control the summary's length and quality. max_length is the maximum number of tokens in the summary. We have set 512, so here the summary will be no longer than 512 tokens. We have tokenized the input text here. To generate the summary, we have used the generate method. Here we have some parameters: for the input, the following, tokenized. Then max_length, which is the maximum number of tokens in the summary. This is the minimum number of tokens in the summary. Then length_penalty. What is this? This encourages longer or shorter summaries. Here it is two, which means longer summaries.
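As a hedged sketch of this summarization flow: the model name below is an assumption (the transcript does not show it), and so are the summary length bounds of 150 and 40 tokens. The length penalty of 2.0, the beam width of 4, and the value 512 come from the lesson; where exactly 512 is applied in the original notebook is not visible in the transcript, so this sketch uses it as the input truncation limit. The helper name summarize is hypothetical.

```python
# Requires: pip install transformers torch

# Assumption: the model is not named in the transcript; this is a public
# summarization checkpoint. Smaller checkpoints are loaded the same way.
MODEL_NAME = "facebook/bart-large-cnn"

# length_penalty=2.0 and num_beams=4 follow the lesson. The summary length
# bounds (max_length / min_length) are illustrative assumptions.
GENERATION_KWARGS = {
    "max_length": 150,       # upper bound on summary tokens
    "min_length": 40,        # lower bound on summary tokens
    "length_penalty": 2.0,   # values above 1.0 favor longer summaries
    "num_beams": 4,          # beam search width
    "early_stopping": True,  # stop once all beams have finished
}


def summarize(text, model_name=MODEL_NAME, max_input_tokens=512):
    """Summarize text with a sequence-to-sequence model."""
    # Imported here so the settings above can be inspected without
    # installing the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Tokenize the input, truncating anything beyond max_input_tokens.
    inputs = tokenizer(
        text, return_tensors="pt", max_length=max_input_tokens, truncation=True
    )
    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **GENERATION_KWARGS,
    )
    # Convert the generated token IDs back to text.
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Calling summarize(some_long_text) in a Colab cell returns the summary as a string, which can then be printed.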
num_beams controls the beam search width. Higher values improve quality but slow down inference. Here we have set it to four. That means four beams for decoding. Okay, so here was our input, and this is the summary here. We have printed the summary here. We have summarized it. Okay. So in this way, guys, we can use Hugging Face to summarize text easily.
13. Text to Text (Translate) using Hugging Face: In this lesson, we will understand how we can perform translation using Hugging Face, that is, text-to-text generation. Let us see. For translation tasks, we will use the Hugging Face transformers library. Some models are already provided for this. So translation, as we all know, includes, let's say, translating English text to Spanish. This is a part of text-to-text models that require a task prefix to specify the type of task, for example, translation, summarization, and others. Text-to-text generation includes not only translation but also summarization, paraphrasing, question answering, and even sentiment classification. So let us see the difference between text-to-text and text generation. Text generation is used for autoregressive text generation, where the model generates text sequentially, one token at a time, like dialog systems, text completions, and others. The text-to-text generation class is used for sequence-to-sequence tasks, where the model will take an input sequence and generate an output sequence, like text summarization, paraphrasing, and even translation. Now, let us see an example to perform translation using the Hugging Face transformers library. In this, we will use the model t5-small, which is publicly available on Hugging Face. This model is a smaller version of the T5 model and can be used for tasks like summarization, translation, and even question answering. Let us see the example on Google Colab. Here is our Google Colab. We just saw the text summarization example.
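The task-prefix idea can be sketched as follows. The model t5-small and the English to Spanish direction follow the lesson; the helper names build_prompt and translate, and the generation settings max_length=50 and num_beams=4, are assumptions for illustration.

```python
# Requires: pip install transformers torch sentencepiece

MODEL_NAME = "t5-small"  # model named in the lesson


def build_prompt(text, source_lang="English", target_lang="Spanish"):
    """Prefix the input so T5 knows which task to perform."""
    return f"translate {source_lang} to {target_lang}: {text}"


def translate(text, source_lang="English", target_lang="Spanish",
              model_name=MODEL_NAME):
    """Translate text with a T5-style text-to-text model."""
    # Imported here so build_prompt works without the heavy dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Tokenize the prefixed text into input IDs the model can process.
    prompt = build_prompt(text, source_lang, target_lang)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    # Generation settings here are assumptions, not the lesson's values.
    output_ids = model.generate(
        input_ids, max_length=50, num_beams=4, early_stopping=True
    )
    # Decode the output tokens back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

One caveat worth keeping in mind: the original T5 checkpoints were trained with translation prefixes for German, French, and Romanian, so Spanish output from t5-small may be unreliable, and passing target_lang="French" is a useful comparison.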
Now, let us open the translation example. Here it is. First, we will install the required libraries, that is, the following here. We have used the pip install command. We already saw this command before. After that, we will load a pre-trained translation model. That is, here we have loaded the T5 model. It is a versatile text-to-text model that can handle translation by prefixing the input with a task-specific prompt. So we are loading a T5 model here. Here we have prepared the input text. So this is the text we will translate, with the prefix translate English to Spanish. That is, the following text will get translated. Tokenize the input text into input IDs that the model can process. Use the model to generate the translated text. You can customize the generation process with parameters like max_length and num_beams, which we saw in the previous lesson also. Here the output tokens will be decoded to text, and after that, we will print the translated text. So here is the output, the translated text. My name is Amit Diwan and I love cricket. So here it is translated to Spanish. So in this way, guys, we can perform translation.
14. Question Answering using Hugging Face: In this lesson, we will understand how to use Hugging Face for question answering. We will also see an example. Let us start. So we will use the transformers library of Hugging Face for performing question answering tasks. We will run the code on Google Colab. So here we will use the following model, which is publicly available on Hugging Face, so we don't need to create an access token for this. Let us see the example on Google Colab. So here we will open our code. First, we will install the required libraries we have shown. We have used the same pip install command which we saw before to install the required libraries. After that, we will load a pre-trained QA model and tokenizer. Here is our model and the tokenizer. Prepare the input for the QA task, that is, for the question answering task.
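A minimal sketch of the extractive question answering flow in this lesson (load the model and tokenizer, tokenize the question and context, pick the highest-scoring start and end positions, decode the answer) might look like this. The model name is an assumption because the transcript does not show it, the example context is hypothetical, and best_span and answer_question are made-up helper names.

```python
# Requires: pip install transformers torch

# Assumption: the lesson's model is not named in the transcript; this is a
# public extractive QA checkpoint fine-tuned on SQuAD.
MODEL_NAME = "distilbert-base-cased-distilled-squad"


def best_span(start_scores, end_scores):
    """Return the indices of the highest start score and highest end score."""
    start = max(range(len(start_scores)), key=start_scores.__getitem__)
    end = max(range(len(end_scores)), key=end_scores.__getitem__)
    return start, end


def answer_question(question, context, model_name=MODEL_NAME):
    """Extract the answer to question from context."""
    # Imported here so best_span works without the heavy dependencies.
    import torch
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)

    # Tokenize the question and context together as one input pair.
    inputs = tokenizer(question, context, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Most likely start and end token positions. They are picked
    # independently, so end < start (an empty answer) is possible.
    start, end = best_span(
        outputs.start_logits[0].tolist(), outputs.end_logits[0].tolist()
    )
    answer_tokens = inputs["input_ids"][0][start : end + 1]
    # Convert the token IDs back to words.
    return tokenizer.decode(answer_tokens, skip_special_tokens=True)


# Hypothetical context and question in the spirit of the lesson.
CONTEXT = "Amit Diwan is a corporate trainer based in Delhi."
QUESTION = "Where is Amit Diwan based?"
```

Calling answer_question(QUESTION, CONTEXT) in a Colab cell returns the extracted answer span as a string.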
We need a context as well as a question. What is the context now? It is a paragraph or text where the answer might be found. That is the following. I'm providing a context also, and here is the question. So this is about me, and here is the question you want to answer. Okay. So we have set both. After that, we will tokenize the input, that is, tokenize the context and question using the tokenizer. We have done both. Get the model's prediction: pass the tokenized input to the model to get the answer. That is the following. It will extract the start and end scores also. It will get the most likely start and end positions here, and it will use the same to convert token IDs back to words so that the answer is displayed here. answer_tokens will be set here and it will be decoded back to words. This will have your output. So here the question was where Amit Diwan is based. The context was the following, and the answer is Delhi. So in this way, guys, we can perform question answering easily.
15. Text to Image using Hugging Face: In this lesson, we will understand how we can perform text to image using Hugging Face. Let us understand with an example. So here we will use the Hugging Face diffusers library. This example will use the Stable Diffusion model also, which is one of the most popular text-to-image models available in the diffusers library. Now, what is the diffusers library and Stable Diffusion? The diffusers library is an open-source Python library, developed by Hugging Face, that focuses on diffusion models for generating images, audio, and other types of data. Diffusion models are a class of generative models. What is Stable Diffusion? It is a latent diffusion model designed for high-quality image generation. So you can generate images from text prompts using this. It is also one of the most popular generative models. Let us see the example. Here we will use a publicly available model on Hugging Face. Let us see the example and convert text to image.
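A minimal sketch of the text-to-image example introduced here, using the prompt and output file name from the lesson. The model ID is an assumption (the transcript only says a publicly available Stable Diffusion model is used), and generate_image is a hypothetical helper name.

```python
# Requires: pip install diffusers transformers torch accelerate

# Assumption: the exact checkpoint is not named in the transcript; this is
# a public Stable Diffusion checkpoint on the Hugging Face Hub.
MODEL_NAME = "CompVis/stable-diffusion-v1-4"

PROMPT = "Flying cars soar over a futuristic cityscape at sunset"
OUTPUT_PATH = "generated_image.png"


def generate_image(prompt=PROMPT, output_path=OUTPUT_PATH,
                   model_name=MODEL_NAME):
    """Generate one image from a text prompt and save it as a PNG file."""
    # Imported here so the constants above can be used without the
    # heavy dependencies being installed.
    import torch
    from diffusers import StableDiffusionPipeline

    # Use the GPU with half precision when available (e.g. a Colab GPU).
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = StableDiffusionPipeline.from_pretrained(model_name, torch_dtype=dtype)
    pipe = pipe.to(device)

    # The pipeline returns PIL images; take the first one.
    image = pipe(prompt).images[0]
    image.save(output_path)
    print(f"Image saved as {output_path}")
    return output_path
```

On a CPU-only runtime this still works but is very slow, which is one reason to select a GPU runtime on Colab before calling generate_image().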
The output will be generated as an image on Google Colab itself. So let us see. Here is our Google Colab. Let us open our notebook for text to image. Here it is. First, we will install the required libraries using the same pip install command we already discussed. So now we will load the Stable Diffusion pipeline. The diffusers library provides a Stable Diffusion pipeline that makes it easy to generate images from text prompts. We will load the Stable Diffusion model here. Now generate an image from a text prompt. You can easily generate an image by passing a text prompt to the pipeline. Here is our prompt: Flying cars soar over a futuristic cityscape at sunset. The following will generate the image. Okay. Here is our image. This image will get saved on Google Colab itself using the save method, and it will also print, Image saved as generated_image.png. A PNG file will get generated, and it will be visible on Google Colab. Click here. You can see Files. Now, I'll run it to generate an image and save it. Okay, so here is our image. It's written, Image saved as generated_image.png. Okay, so it generated it. I'll just go here. From here, you can download it. You can also copy the path. I'll click Download. It downloaded. Okay, here it is. So we generated an image, that is, text to image.
16. Text to Video using Hugging Face: In this lesson, we will understand how to perform text to video using Hugging Face. This is called text-to-video synthesis. We will understand what it is, and we will also run a sample example. So let us start. Text to video includes generating video from textual descriptions, like typing a text and generating a video. Like we saw in the previous lesson, text to image, we typed a text and generated an image. In this case, we will generate a video. So we have a lot of pre-trained models and tools for generating videos. Hugging Face provides the same models.
Text-to-video synthesis, the term I just told you, includes generating a sequence of frames based on a textual description. Since it's a complex task, it requires combining different NLP models with generative models or even diffusion models. Okay, diffusion models we saw in the previous lesson. They are used for generating images or videos. Let us see some video generation frameworks before moving towards the example. One of the most popular ones is Runway ML. It offers tools for video generation and editing. Mostly for AI-generated videos, you can use Pika Labs. With that, DeepMind's Perceiver IO can also be used to handle multimodal inputs. Multimodal input can include text, images, and even videos. You need to use a library like PyTorch or TensorFlow so that you can build pipelines for generating video frames. Let us see an example. So here we will use the diffusers library also. We already discussed the diffusers library. It is an open-source library developed by Hugging Face and used for generating images and even videos. We will use the publicly available Stable Diffusion model. In our example, we will run the code on Google Colab like we saw before. Let us start. Here is our Google Colab. Let us open our code. File, open notebook. We will open our notebook for text to video. I'll type video only to search. Here it is. First, we will install. So here we have used the pip install command to install the transformers as well as the diffusers library, along with PyTorch. After that, we will load a text-to-image model, so here we are loading it. This is the model I already told you about. We have used the diffusers library to load a pre-trained text-to-image model like Stable Diffusion. Generate frames from text: first, we have set the prompt. Here, we will generate individual frames based on the text description. This is the text description: a futuristic cityscape at night with flying cars. This will generate ten frames using the for-in loop. Here it is, ten, and it will append each frame.
Later on, we have used the OpenCV library to stitch the frames into a video. Here we are using OpenCV inside the for-in loop so that we can stitch them. We have also used the NumPy library. We are using a NumPy array in it. So this will save the frames as images. And this will stitch the frames into a video, and the following will display the output, which is gathering the frames using the for-in loop. And the output will be displayed like this, in the form of frames. So here when I run, it will display ten frames because we are generating ten frames here. And after that, in the end, it will display the video. So here is the output. The output video will have the following name, output_video.mp4, but it will also generate frames. How many frames? Ten frames. The format of the frame name will be the following: frame underscore the value of i. So the frames would be like frame_0.png, frame_1.png, and it will go until nine. That means ten frames. And the output will be here, as I told you. Now let us run it. Now we will click here. And here you can see, as I told you, it will generate ten frames, frame_0.png till nine, and the output video will be here. So this was the output. I'll just click here and click Download. Download it. Right click and open. Here is our video. Okay, you can see ten frames. So in this way, guys, we can generate video from text with Hugging Face. Thank you for watching the video.
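The frame-generation and stitching steps from this lesson can be sketched roughly as below. The prompt, the frame count of ten, and the file names follow the lesson; the model ID, the frame rate, the mp4v codec, and the helper names are assumptions, since the notebook code is not shown in the transcript.

```python
# Requires: pip install diffusers transformers torch opencv-python numpy

# Assumption: the exact checkpoint is not named in the transcript.
MODEL_NAME = "CompVis/stable-diffusion-v1-4"

PROMPT = "A futuristic cityscape at night with flying cars"
NUM_FRAMES = 10
VIDEO_PATH = "output_video.mp4"


def frame_filename(i):
    """File name for frame i, e.g. frame_0.png ... frame_9.png."""
    return f"frame_{i}.png"


def generate_frames(prompt=PROMPT, num_frames=NUM_FRAMES,
                    model_name=MODEL_NAME):
    """Generate num_frames images from one prompt and save each as a PNG."""
    # Imported here so frame_filename works without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(model_name, torch_dtype=dtype)
    pipe = pipe.to(device)

    frames = []
    for i in range(num_frames):
        image = pipe(prompt).images[0]  # PIL image
        image.save(frame_filename(i))
        frames.append(image)
    return frames


def stitch_frames(frames, video_path=VIDEO_PATH, fps=1):
    """Write a list of PIL images to an MP4 file using OpenCV."""
    import cv2
    import numpy as np

    width, height = frames[0].size  # PIL reports (width, height)
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # codec choice is an assumption
    writer = cv2.VideoWriter(video_path, fourcc, fps, (width, height))
    for frame in frames:
        # PIL images are RGB; OpenCV expects BGR channel order.
        writer.write(cv2.cvtColor(np.array(frame), cv2.COLOR_RGB2BGR))
    writer.release()
    return video_path
```

Running stitch_frames(generate_frames()) in a Colab cell produces the ten PNG files and the video. Each frame is generated independently from the same prompt, so the result is closer to a slideshow of related images than to smooth motion.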