Transcripts
1. Introduction: Welcome to this section on secure file
uploads and validation. This is really a
continuation from our previous section where
we're going to extend that sample project to make sure that our file
uploads are done in the most secure manner possible because every
upload is an entry point, and we have to protect that entry point
before it's too late. In this section, we're going
to explain and explore common file upload risks,
including common attacks. We're going to implement secure file upload
handling using ASP.NET Core and the
built-in IFormFile data type. We're going to look at
validating and sanitizing input. And what I think is the most exciting part
of this entire section is the fact that we're
going to explore implementing
anti-virus scanning. So this is going to be a very in depth and
interesting section. Stick around. I know
you're going to enjoy it. I'll see you in the next lesson.
2. Understanding File Upload Vulnerabilities: Moving on to another
dangerous form of input, and that is file uploads. So why are they dangerous? Well, file uploads
can introduce malware or executables
disguised as documents. They can lead to path traversal
attacks where an attacker might be able to navigate to a particular directory or file that they
shouldn't be able to. They can escalate their
privileges and get access to sensitive
parts of your server, and attackers can use uploads to store web
shells or scripts, overwrite configuration files, or leak sensitive
files like logs, keys, and config files. Sometimes in handling
file uploads, we make some dangerous
assumptions, like we'll say, "Oh,
it's just a PDF," or "We don't execute
uploaded files," or "We save it outside
of wwwroot, so it's safe," or "The
OS will protect us." And a lot of the time, well, if our user base has no
ill intent, then sure. But remember that we are
treating all input as evil. So let us start by looking
at the potential for malware or executables to
be uploaded to our system. So uploaded files are untrusted
binaries, not just data. File extensions and MIME
types can be spoofed. Executables can be disguised as documents like PDFs,
images, or zip files. They can be embedded
inside other files. Storing malware is
often only the first stage; it is not the
attack itself. Later, downloading
or processing the file can activate
the payload. So that is one type of attack that can be launched
through file uploads. The next is path traversal. Now, the path traversal attack, also called directory
traversal, is a vulnerability
where an attacker manipulates a path input so that your application
reads from or writes to a location outside of the
intended or target directory. It is fundamentally a file
system injection problem. Untrusted input becomes a
part of the file path, and the attacker
uses that to control or navigate the server's
directory structure. So the attack pattern
looks a little like this. They'll upload a file, but they'll prefix the file name with a path that uses the dot-dot-slash (../) sequence to
escape the intended directory. And the goal here is
to read or write files outside of the intended or
target directory structure. Now, the impact that this
can have is that they can overwrite app or OS files,
expose system information, or even enable
remote code execution. When mitigating against
these kinds of attacks, we have to implement certain measures like the
principle of least privilege, where we ensure that the web app hosting account, or on Windows and IIS servers, the app pool identity, can write only to a
designated upload directory and has no access to
wwwroot or system paths. You also want to keep the default identity
that is assigned in IIS; do not replace it with an identity that has higher privileges,
like LocalSystem. And if we want to
prevent path traversal, we can normalize paths using
certain built-in functions. We can also sanitize file names
and combine paths safely. We can enforce directory
checks to make sure that whatever is being uploaded
resolves to a safe path, and we can also throw unauthorized access
exceptions whenever there is an invalid file path. So there are several ways
to secure the file uploads, and we're going to
be going through them in the next few lessons.
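The path normalization and directory check mentioned in this lesson can be sketched in C# roughly as follows. This is a minimal illustration, not the course project's code: the demo-app/uploads root, the file names, and the IsInsideRoot helper are all assumptions made up for the example. It relies only on Path.Combine and Path.GetFullPath from the base class library.

```csharp
using System;
using System.IO;

// Hypothetical upload root, used only for this illustration.
string uploadRoot = Path.GetFullPath(Path.Combine(Path.GetTempPath(), "demo-app", "uploads"));

// Returns true only when the fully resolved candidate path stays inside the root.
static bool IsInsideRoot(string root, string candidatePath)
{
    string fullRoot = Path.GetFullPath(root);
    // Append a separator so a sibling such as "uploads-evil" cannot pass as a child of "uploads".
    if (!fullRoot.EndsWith(Path.DirectorySeparatorChar))
        fullRoot += Path.DirectorySeparatorChar;

    // GetFullPath collapses "../" segments, exposing where the path really points.
    string fullCandidate = Path.GetFullPath(candidatePath);
    return fullCandidate.StartsWith(fullRoot, StringComparison.Ordinal);
}

string safe = Path.Combine(uploadRoot, "report.pdf");
string hostile = Path.Combine(uploadRoot, "../../evil.txt");

Console.WriteLine(IsInsideRoot(uploadRoot, safe));    // True
Console.WriteLine(IsInsideRoot(uploadRoot, hostile)); // False
```

The point of resolving the path before comparing is that a plain string check on the unresolved path would still see the upload root as its prefix, even though the ../ segments walk out of it.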
3. Preventing Path Traversal Attacks: The first thing that
we want to triage is the potential for a
path traversal attack. So let's jump over to
our files controller and review the vulnerable code and see what we need to triage. In this controller, we're
setting the upload root to be a path inside of the
content root path, and then we're creating a folder called App_Data/uploads. So that is the destination path, and if it doesn't exist, then we go ahead and create it. In the upload method, we're first making sure
that our file has some data. Otherwise, we'll
return a bad request saying you didn't provide a file. But then we're going
to go on to upload. Now, here's the
specific vulnerability that we're addressing right now, which is path traversal, where an attacker can provide a file name with these
escape sequences. So each time you see dot-dot-slash (../), it means I'm going
up to the parent directory. If they provide enough of these
and they know whether they're dealing with a Windows
system or a Linux system, they can deduce how
that file system looks and how many dot-dot-slashes
need to be used to come out of the folder that the destination file is going to,
potentially going into the Windows system
folder and then potentially planting some
evil file in there. And then we're setting
up the file name using the exact filename
that was uploaded, and then we're putting
it inside of that path. So all of this here introduces the potential for path
traversal attacks. So in order to simulate this
attack, we'll need Fiddler. Go to telerik.com/fiddler, and you can try it for free, and it is available for
all operating systems. And once you launch it,
it will actually start monitoring all of the web
traffic on your computer. So you'll see this pane
starting to fill up. But we want to focus
on our API activities. So a quick way to configure this would be
to go to the Filters tab, enable Use Filters, then enable Show only the
following Hosts, and then enter localhost with the port of the running app. In our case, it is
localhost:5001. Once you click outside
of this text area, it will commit that change, and then the pane should
stop filling up as much. The next thing you want to
do is enable breakpoints for when requests are going out. So you're going to go to Rules, click Automatic Breakpoints, and then you're going to
enable Before Requests, or you can just press F11, or you can click in this
section right here. It's not very intuitive, so I prefer to
just go to Rules, Automatic Breakpoints so I know exactly which
one is enabled when. So once you have set up Fiddler, you can launch the API. You may notice that the Swagger
page is failing to load, and if you open Fiddler, you see that it's hitting
that automatic breakpoint. So I'm going to disable
the breakpoint for now by just clearing it out of
that section of the window, and once Swagger loads,
then we're good to go. Now, I'm going to
try a file upload. And I'm putting in a
valid patient intake ID. And if you're not sure
what the IDs are, you can always go to the
intake/search endpoint, and you'll see the
different IDs. And I'm uploading a simple file, test.txt. So when I execute that,
I get a 201 response. If I go back to Visual Studio, I will see the new
folders created, App_Data and uploads, and inside them test.txt. Now, if I want to perform
a path traversal, the good thing about the
operating system is that it limits the characters I
can place in the file name. So I cannot just go and
change a file name like this. That is, well, a good
line of defense, but using a tool like Fiddler, I can actually
intercept the request. So now, in Fiddler, I'm going to clear everything out, and then I'm going to re-enable
the breakpoint. Then I'll go back and I'm going
to submit the same request. This time Fiddler intercepts it. And if I click on it
and go to Inspectors, I can then see the
details of the request. Here you can see Headers, TextView, SyntaxView, et cetera. We're going to Raw. And from the Raw view, I can look at the
content being submitted. Now, if I introduce dot
dot slash dot dot slash (../../) in front of the file name and then click Run
to Completion, it will complete the operation. And if I look in the
file system, this time, test.txt ended
up in the src folder. So it ended up in a
folder or directory that is completely outside
of the API directory, which is the directory that has the App_Data/uploads
destination. So right there, I was able to plant a file in a place where
it should not be planted. This is a very simple
demo, of course, but I'm just showing
you that it is a very real attack, so
we need to triage it. Now, ground zero for
this attack is the fact that we are retaining
the original filename. So the filename that came
in through the request, whether it came
directly from the form or came from Fiddler
in our situation, we don't want to keep it. So here you'll see the suggestion
is that because there's no GUID or sanitization,
there's an issue here. So the first thing
I'm going to do is retain the file extension. So now I know what
extension it should have, but I'm now going to create that stored file name to be a new GUID with that
file extension. So I'm no longer going to care about whatever name
you called it. I'm now going to
give it its own name for my storage purposes. That being said, if you
have to store it in the database later on
and upload the record, you can have the
original file name, which is whatever was uploaded, but then we're going
to store it using our own internal standards. Now after that, we go ahead
and we do a Path.Combine. Next, I'm going to go ahead
and get the full path. So the full path here will be
the fully resolved file path, which would be every
directory up to and including the file name. And then as another
safety check, I'm going to say if
the full path does not start with the
upload root folder name, then we're going to log the security breach and
return bad request. So no matter what, if
somehow we ended up with a different file path
from the upload root and our newly formed file name, we're still going
to make sure that our destination is
at the minimum, inside the upload root folder. Now that we've added that fix, let me go ahead and run, and remember that
we need to disable the Fiddler breakpoint until we're sending the requests that we
intend to evaluate. So let's make sure that Swagger
loads as we expect it to. And then let's go ahead
and try our upload again. But before I submit, I'm going to re-enable
that breakpoint, just to make sure I intercept that traffic, and let's clear, just to make sure we have
a clean slate, then execute. Jump back over to Fiddler and let us fiddle with it again. And I'm just going
to do dot dot slash. I think that's enough. It means it would go to a different directory, and
then run to completion. We get our 201. If I jump back over to the code, I'm now going to see that I
have a different filename altogether. It's now just
a GUID with the extension. If you use a SQLite manager
and check the database, you will see the originally
submitted file name as well as the new filename.
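As a rough recap of the fix in this lesson, the sketch below generates a server-side file name from a new GUID plus the original extension, then double-checks that the combined path still resolves inside the upload root. It is an assumption-laden illustration rather than the project's controller code: the upload root, the CreateStoredFileName helper, and the sample file name are hypothetical, and a real controller action would log and return a bad request instead of printing.

```csharp
using System;
using System.IO;

// Hypothetical upload root for the sketch; the lesson's project uses its own folder.
string uploadRoot = Path.GetFullPath(Path.Combine(Path.GetTempPath(), "demo-app", "uploads"));

// Builds the name used on disk: a fresh GUID plus the lower-cased original extension.
// Nothing else from the client-supplied name is reused.
static string CreateStoredFileName(string clientFileName)
{
    string extension = Path.GetExtension(clientFileName).ToLowerInvariant();
    return Guid.NewGuid().ToString("N") + extension;
}

string originalName = "../../test.txt"; // what a tampered request might send
string storedName = CreateStoredFileName(originalName);
string fullPath = Path.GetFullPath(Path.Combine(uploadRoot, storedName));

// Defense in depth: refuse anything that does not resolve inside the upload root.
bool insideRoot = fullPath.StartsWith(
    uploadRoot + Path.DirectorySeparatorChar, StringComparison.Ordinal);

Console.WriteLine($"original name (kept only as metadata): {originalName}");
Console.WriteLine($"stored name: {storedName}");
Console.WriteLine($"inside upload root: {insideRoot}");
```

Because the stored name never contains any part of the client-supplied path, the traversal sequence has nothing to act on; the containment check is only a second safety net.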
4. Limiting Upload File Size: We're going to be adjusting the maximum size that any
uploaded file can be. So here, you'll see that the
first warning is letting us know that an attacker and maybe not
necessarily an attacker, but any user can upload multiple gigabyte
files or there's just no limit on the size of
the file that can be uploaded and it can lead
to exhausted disk space. Generally speaking, you have to be flexible with this
because there are times when you need a
file that can far exceed the expected size of
what a document would be. So we're going to implement validation that is
flexible enough that it will allow us to manage how these
file sizes look. To do this, I'm going to make a few adjustments to
our code as it is. So we have our [FromForm]
file upload request, and all we're really
getting from here is the patient intake ID. Now, we do have
this vulnerability. I'm going to remove that and just make sure that
this is not optional. I'll flag it as required
to ensure that no file ever gets
uploaded without this ID. But that's not really
why we're here. I want to add the IFormFile
to this request detail, so it's actually a part of the DTO that is
accepting the data. I add this as a property, and this now allows me to add
any attributes over it. So I have a custom
attribute that allows me to specify the maximum file
size that can be uploaded. I already wrote it, so we'll
just go through it together. I'm calling it the MaxFileSize
attribute, and it can be used for any
property or parameter. And I'm defining a field called
_maxFileSize, which gets initialized
based on the max file size parameter that's passed in
through the constructor. In IsValid, I'm simply saying if the value that is being
validated is an IFormFile, then we're going to
return true or false depending on whether the length exceeds
the max file size. We're also going to format an error message if we
need to return one. So how do we use it? Well, we know how to use our
custom attributes already. So I'm just going to decorate
our IFormFile property. Let me rename this to fit in
with the naming conventions. Now, I'm decorating this
with the Required attribute, because of course, if you're uploading a file,
then we need the file. And we're specifying the
max file size based on this new custom attribute
to be 10 by 1024 by 1024, in other words, 10 megabytes. So because we can do this, we can now add custom sizes anywhere we're using this MaxFileSize
attribute. So for the patient intake, 10 megabytes is the maximum
we're willing to take. But maybe you're doing a medical facility management
system where they have to upload high
resolution images where it might be 50 megabytes, it might be 100
megabytes, right? So at that point, whenever you're
validating the file size, you have that flexibility. So now that I have that
file upload request, I have to do a few
refactors over on this side. So anything that was just file now becomes request.File, so I'll just go through
and change those out. And I can now remove this
if statement because, well, the validation is checking
and saying it's required, so I don't have to check if
there was a file anymore. So that's a little cleaner. Let us test our changes. I'm going to execute our application and fill
out our upload form. And when I execute using the same test.txt
file, we get our 201, so
we know that this works. Now, if I go ahead and
choose another file, and this is just a big file that I have
that exceeds 10 megabytes, we get a 400 letting me know the maximum file size
allowed is 10 megabytes. So just like that, we were
able to limit the file size. Remember, use this at your
own discretion per endpoint, per upload file type, but it really is
that easy to make sure that the files you're
accepting are not too large.
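A custom size-limit attribute along the lines described in this lesson might look like the sketch below. It is not the course's actual source: the class body, the error message wording, and the FileUploadRequest shape are assumptions, and the snippet needs an ASP.NET Core project (it references IFormFile and FormFile from Microsoft.AspNetCore.Http). A null value is treated as valid here so that the separate Required attribute stays responsible for missing files.

```csharp
using System;
using System.ComponentModel.DataAnnotations;
using System.IO;
using Microsoft.AspNetCore.Http;

// Usage sketch: a 2-byte in-memory file checked against a 1-byte and a 10 MB limit.
var file = new FormFile(new MemoryStream(new byte[2]), 0, 2, "file", "test.txt");
Console.WriteLine(new MaxFileSizeAttribute(1).IsValid(file));                // False
Console.WriteLine(new MaxFileSizeAttribute(10 * 1024 * 1024).IsValid(file)); // True

[AttributeUsage(AttributeTargets.Property | AttributeTargets.Parameter)]
public sealed class MaxFileSizeAttribute : ValidationAttribute
{
    private readonly long _maxFileSize; // limit in bytes

    public MaxFileSizeAttribute(long maxFileSize) => _maxFileSize = maxFileSize;

    // Anything that is not an IFormFile (including null) passes; files pass only within the limit.
    public override bool IsValid(object? value) =>
        value is not IFormFile formFile || formFile.Length <= _maxFileSize;

    public override string FormatErrorMessage(string name) =>
        $"Maximum file size allowed for {name} is {_maxFileSize / (1024 * 1024)} MB.";
}

// Hypothetical request DTO showing how the attribute would be applied.
public class FileUploadRequest
{
    [Required]
    [MaxFileSize(10 * 1024 * 1024)] // 10 MB
    public IFormFile File { get; set; } = default!;
}
```

Taking the limit as a constructor argument is what gives the per-endpoint flexibility the lesson describes: a different DTO can carry a 50 MB or 100 MB limit without a new attribute.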
5. Validating File Types: We're moving on to
validating file types. So one of the comments
here warns that we are currently
accepting executables, DLLs and various
script file types. And these are very
dangerous because it means an attacker could
actually upload evil.exe. So while we may
not allow them to upload it to
the wrong directory, we are still enabling that evil EXE file, or a script file, to be
uploaded, and that is a potential
point of attack. Now, there are several ways
that this can be handled. Way number one is that we
simply check the file extension, and that sounds effective because we can get the file
extension and we can check for the known dangerous ones or create a whitelist and make sure that the file extension
is in the whitelist. However, somebody can easily change the extension of a file, so they can have
an executable file or a script file
and call it a PDF, and then you think it's a PDF, and when you open it,
it will then execute. So that is not an
effective method of checking, as simple as it is. Another way is to
check the MIME type, which is basically
metadata about the file that declares
what kind of file it is, but it can also be spoofed. So in this lesson, we're going to go through a more comprehensive
method of checking. And in the interest of time, I'm just going to
quickly show you the setup and then we'll
go through the code. So in the folder structure, I've introduced a new
folder called Services, and here we have the file
validation service. In appsettings.json, I've also added a section
called file validation, and I have a section for
denied extensions and a section for allowed
signatures by extension. But notice that this is
a dictionary where I'm defining the extension
and the signature. So this is what each of these
file types should have as the header of the file before we can verify it is
that type of file. Also, there are
certain extensions that can introduce
vulnerabilities. For instance, SVG can
enable XSS attacks. So if we want to
deny some outright, we can do that, and we can have our allowed ones
where we whitelist. Also, in Program.cs, I have, one, registered the model that represents
our appsettings.json section. I'm going to show you
that model in a few, but we have file
validation options, which is to be
mapped to the file validation section of the config. I'm also registering our new file validation
service and the interface. Let's jump over there
to see what's here. We have that interface, and we are defining a
method called ValidateAsync, which is going to return
a file validation result, and it is taking an IFormFile
and a cancellation token. In the implementation, we have our file validation
options being injected. So we're extracting the
value from the config. We're also setting the
max signature bytes, where we're looking at the bytes of the file that is coming in, and we make sure that
we read enough bytes to match the longest signature
in the allow list. Now our models are defined as
the file validation result, which is what we will
be returning from our method, ValidateAsync. And it's simply saying
IsValid true or false, giving a reason
and a matched extension. And we also have the
file validation options, which we mentioned before, which is going to map to
our denied extensions list, as well as our allowed
signatures by extension list. So let's quickly go through
our implementation. First of all, we're making
sure the file is not null. This check, of course, is maybe optional because if
we have the required flag, then obviously a file will be present before it
gets this far in the code, so you can decide if you
want this check or not. Then we're getting
the extension. And if the extension is empty, meaning somebody tried
to upload a file without one, and there are times when
you have files that don't necessarily have
an extension to them, if they try that,
then we are going to reject it right
from the get go. Then we're checking our
denied extensions list. If the extension appears
in that blacklist, and here we're doing
that SVG specifically, but you might have
other ideas as well, then we reject it. Then if the extension is
not in the allowed list, then we reject it as well. So I'm just showing
you can either do the blacklist or do
the allowed list. But for the allowed list, what we're going to do
is read the header off the file relative to the
max signature bytes. So we're reading as many bytes as the largest number
of bytes that we need. And then we're going to compare against the signature for
anything that is allowed, and that's what that logic does. It's relatively simple.
It's just going one by one to see if
each byte matches. Then if everything is okay, we return true: there's a match where
whatever is uploaded matches a signature. And if it gets this far, then it does not match. Then you'll see here the
definition of the method to read the header and the definition of the method to convert
the hex to bytes. So you can spend
some time and go through those blocks of code. But that is essentially what our file validation
service will be doing. Now, jumping back
to our controller, I will now inject our
file validation service. I can inject it into
the whole class, but I can also just inject it into the exact action
that will be using it. So I'm going to say [FromServices]
and choose that IFileValidationService,
and now that we have that, we can add a simple enough if check right at the
start of the method. And we also need that
cancellation token. So I'll just add that
parameter to the method. So if the upload
operation gets canceled, then this operation gets
canceled wherever it is. So that's a nice way
to securely write asynchronous methods by making sure the cancellation
token is present. Let us take this
for a quick test. I'm going to start off by
testing with a JPEG file. And we see where we got a response suggesting
that it was successful. Let's try again with a file
that has no extension, and it was rejected,
rightfully so. And finally, I'm going to
test with this other file, which still has a JPEG extension, but this is not really a JPEG. It's actually an executable, and I changed the extension. So when I execute,
it will reject it because the file contents do not match the
declared file type. So this is a nice, comprehensive way
to make sure that even if the extension
looks okay, we are verifying that
the file type or the contents of the file actually match up to what
is being advertised.
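The signature comparison walked through in this lesson can be illustrated with a small, self-contained C# sketch. The allow list below, the MatchesSignature helper, and the sample byte arrays are assumptions for the example and are not the project's configuration; the three signatures shown (JPEG, PNG, PDF) are the well-known leading bytes of those formats. Convert.FromHexString requires .NET 5 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Illustrative allow list: extension -> accepted leading byte sequences, written as hex.
var allowedSignatures = new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
{
    [".jpg"] = new[] { "FFD8FF" },
    [".png"] = new[] { "89504E470D0A1A0A" },
    [".pdf"] = new[] { "25504446" },
};

// Returns true when the stream starts with one of the signatures allowed for the extension.
static bool MatchesSignature(
    Stream content, string extension, IReadOnlyDictionary<string, string[]> allowed)
{
    if (!allowed.TryGetValue(extension, out var hexSignatures))
        return false; // extension is not on the allow list at all

    byte[][] signatures = hexSignatures.Select(Convert.FromHexString).ToArray();
    byte[] header = new byte[signatures.Max(s => s.Length)];

    // Read up to the longest signature length; a single Read call may return fewer bytes.
    int read = 0;
    while (read < header.Length)
    {
        int n = content.Read(header, read, header.Length - read);
        if (n == 0) break; // end of stream
        read += n;
    }

    return signatures.Any(sig =>
        read >= sig.Length && header.Take(sig.Length).SequenceEqual(sig));
}

var realJpeg = new MemoryStream(new byte[] { 0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10 });
var renamedExe = new MemoryStream(new byte[] { 0x4D, 0x5A, 0x90, 0x00 }); // "MZ" executable header

Console.WriteLine(MatchesSignature(realJpeg, ".jpg", allowedSignatures));   // True
Console.WriteLine(MatchesSignature(renamedExe, ".jpg", allowedSignatures)); // False
```

The second call is the renamed-executable case from the demo: the extension is on the allow list, but the leading bytes are not a JPEG header, so the file is rejected.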
6. Adding Anti-Virus Protection: We need to deal with the
vulnerability where there's no virus scanning. We're validating the file type, we're validating the extension
and where it is going. But up until we store it, we still don't know if the
file is a virus or not, even if it matched all of the
headers and other validations. We want to make sure
everything lines up. What we need to do now is
implement some form of quarantine situation where if the file is
flagged as malware, then we move it somewhere else and we do not
proceed to actually upload it to the regular storage
area of the file system, and by extension, we don't
store it in the database. For this lesson, we're
going to be using ClamAV, which is an open source
antivirus engine that can be used for our
educational purposes. Of course, in a
corporate setting, you're advised to use whatever your company
has set as its standard antivirus, but we're going to be using
this because there are packages available for us to be able to integrate with it. So you can go to clamav.net, and you can go to Downloads. There are several setup
options available to us based on your
operating system. But the one that
we're going to be using is via a Docker container. Docker is cross platform, so it's the easiest option
regardless of your OS. If you don't have Docker or you're not very
familiar with Docker, you can go to docker.com
and you can download it. It is also available for
all operating systems, and I will be going through enough to help you
with this exercise. However, if you want more knowledge on Docker
and how you can use it to build cloud-native
apps with ASP.NET Core, then you can check
out my course on cloud-native app
development, where I take you through the basics of Docker and how you can
containerize your apps. For now, go ahead and set up Docker, and once you
have installed it, then jump over to your CLI, and then you can start
the Docker daemon or background service with the
command to start Docker, just to ensure that
it is running. The Docker start
command might look different based on
your operating system. So if you're on macOS or Linux, then it may look a
little different. I suggest that you
go and find out the equivalent so that
you can get started. However, we're here
to install ClamAV, so we pull the image. And then once the image has been pulled,
we want to run it. So we're going to
say docker run, and then we're going to
name this new container clam_container_01, and
we're specifying a port, which is going to be 3310, and we're just mapping it to our internal port
of the same number. Ensure that you have no other
apps running on that port, or you choose an
empty port number for this first port where you don't have any apps that
might be using it. You can change that
one, but this would be the forwarding port
for the app itself, and then we specify
the image name. So once we do that, it will start up a container. Let's jump over to the
code side of things, where we're going to start by introducing a new
package called nClam. This is a wrapper around
the connectivity needed to speak to our
running container or wherever that
ClamAV service is. So go ahead and install the latest package
that's available. And I have introduced a new service called
ClamAV scanner. So I have in this file an interface called
IAntivirusScanner, which does a scan
and does a ping, and we're returning a scan result. The scan result is defined as a class here in
the Models folder, and it gives us the
status, the virus name, and the raw response, whether it
is clean or infected. We also have the enum values for
clean, infected, and error. And we also have ClamAV
options that will map to a section of our appsettings that I'll
show you in a second, where we're defining
the host and the port, and we're using localhost and the port that we
would have set up. So in our appsettings.json file, we have this new
section, ClamAV, where we have our host and port. So we define them here.
If they were missing, then the class has
the defaults anyway, and we define all
defaults as needed. Now jumping back to
our implementation for the ClamAV scanner, which is implementing our
IAntivirusScanner interface, we are defining our ClamClient
and a logger. And in the constructor, we are initializing what
we need to initialize. So we have this class
called ClamClient, and we can pass in the
host and the port values. Ping is simply checking
whether the service is available. So it will either fail with
an exception that we catch and return false, or if it returns something useful,
then we return true. Then, we're doing
ScanAsync. In a nutshell, we're
passing in the file stream, and then we're simply sending
it over to the client. We're using the
send-and-scan method, and then we retrieve
that scan result and return an object as needed. We also catch any exceptions whenever that operation
might have failed. Now, the most significant changes are in our files controller, and I've made them already. The first change is the introduction
of a quarantine folder. So now we have a folder inside of App_Data
that I'm calling quarantine, and this is where files will go whenever they are uploaded, and then they will
be scanned before they're moved to the
actual target directory. I've modified the
upload method to, one, inject our antivirus
scanner service. And then anywhere that we were uploading to the
target directory, I have now changed to
the quarantine path for that target directory. Now, the real
change that we want to investigate here
is with the scan. So we save the file
to the quarantine path, and then we're introducing our virus scan while the
file is in quarantine. We open the stream on the quarantine
path and then call our ScanAsync method and
retrieve that scan result. The scan result is
going to be evaluated. If the status is clean, then we're going to set our upload path and
close the stream, which is very important
because if we don't do that, then we will get an exception that the file is still in use, and then we can
move the file from quarantine to the upload path. Otherwise, if it is infected
or there was an error, then we're going to have
some other business process. The file stays in quarantine, or you may even want to delete it. You may want to flag it
or send some alert to some cybersecurity resource
at your organization, whatever your
internal process is. I'm going to make a modification
to my own logic here. So I've replaced
the break statement with a return bad request
if it is infected, if there's an error or if an exception is caught
during the scan. Now, the true test of whether
this works will be to introduce at least some
simulation of a virus. So there is this website, secure.eicar.org/eicar.com.txt.
Here they have a TXT file, which has a sequence that every antivirus will
see as a virus. So when you download this file or you save a
file with this sequence, be sure that you save it
in a directory where you have disabled
antivirus scanning, because your
antivirus will see it as a virus and you won't be able
to actually save the file. So for testing and
educational purposes, feel free to use this
just to verify that our ClamAV scan works. So before we jump
over to a full test, to verify that TXT
files can be uploaded, I've extended the allow list. So now .txt is a
part of that list. And with that change, we
can go ahead and do a test. So my first test is to upload our standard test.txt file. And you'll see here that
the scan status is clean. Everything went okay with it. Now I'm going to try with the file that
contains the test virus. And when I execute, you're going to see that no
viruses are accepted. So that means it failed
that virus scan, and that is how we know that
our ClamAV is working. If you were to look at
the container's logs, you would see where it
shows that, in the stream, the EICAR test signature was
found in the uploaded file. And with that, we
have closed the loop, at least provisionally, on secure file uploads
in our system.
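The quarantine-then-scan flow from this lesson can be sketched without any antivirus dependency by passing the scanner in as a delegate. Everything named here is hypothetical: ScanThenPromoteAsync, the folder names, and the FakeScanIsClean stand-in (which simply flags content containing the text FAKE-VIRUS) are assumptions for the illustration. In the lesson's project, the scan step is instead backed by the nClam client talking to the ClamAV container.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Saves the upload to quarantine, scans it there, and moves it to the upload folder
// only when the scan reports clean. Returns the final path, or null if rejected.
static async Task<string?> ScanThenPromoteAsync(
    Stream upload, string quarantineDir, string uploadDir, string storedName,
    Func<Stream, Task<bool>> isClean)
{
    Directory.CreateDirectory(quarantineDir);
    Directory.CreateDirectory(uploadDir);
    string quarantinePath = Path.Combine(quarantineDir, storedName);

    await using (var target = File.Create(quarantinePath))
        await upload.CopyToAsync(target);

    bool clean;
    // The read stream is disposed at the end of this statement, before File.Move,
    // so the file is no longer in use when it is moved.
    await using (var quarantined = File.OpenRead(quarantinePath))
        clean = await isClean(quarantined);

    if (!clean)
        return null; // file stays in quarantine for whatever follow-up process applies

    string finalPath = Path.Combine(uploadDir, storedName);
    File.Move(quarantinePath, finalPath);
    return finalPath;
}

// Stand-in scanner (assumption): treats any content containing "FAKE-VIRUS" as infected.
static async Task<bool> FakeScanIsClean(Stream s)
{
    using var reader = new StreamReader(s, Encoding.UTF8, false, 1024, leaveOpen: true);
    string text = await reader.ReadToEndAsync();
    return !text.Contains("FAKE-VIRUS");
}

string root = Path.Combine(Path.GetTempPath(), "upload-demo-" + Guid.NewGuid().ToString("N"));
string quarantine = Path.Combine(root, "quarantine");
string uploads = Path.Combine(root, "uploads");

var cleanFile = new MemoryStream(Encoding.UTF8.GetBytes("hello"));
var badFile = new MemoryStream(Encoding.UTF8.GetBytes("FAKE-VIRUS payload"));

Console.WriteLine(await ScanThenPromoteAsync(cleanFile, quarantine, uploads, "a.txt", FakeScanIsClean) != null); // True
Console.WriteLine(await ScanThenPromoteAsync(badFile, quarantine, uploads, "b.txt", FakeScanIsClean) != null);   // False
```

Keeping the scanner behind a delegate (or an interface, as the lesson does) means the controller logic can be exercised without a running antivirus service, and the engine can be swapped for whatever an organization standardizes on.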
7. Review: I hope you had as much fun completing this section
as I did making it. Thank you for sticking it out and let us take
some time to review some of the major talking
points regarding file uploads. Firstly, you want to treat every uploaded
file as untrusted. There are several ways of attacking your file system
through uploaded files. The first one is
through path traversal. You want to make sure
that you do not trust the filename that is
given with the uploaded file. It is better for internal
storage to create a random filename
and then you can make the correlation
in the back end. You also want to validate
the file size, the content type, and the
extension of the file. Make sure that there are
limits and limitations set in your system to
ensure that users do not freely upload
anything and everything. And most importantly,
you want to implement virus scanning to
the best of your ability, make sure that you
have a quarantine area where anything that can be flagged as malware
goes so that it does not interfere with
your core systems. So thank you for joining
me in this section. I hope you enjoyed it, and
I'll see you in the next one.