How To Optimize Memory Use In C# | Mark Farragher | Skillshare


How To Optimize Memory Use In C#

Mark Farragher, Microsoft Certified Trainer

Lessons in This Class

17 Lessons (2h 59m)
    • 1. Optimize memory use in C#

    • 2. Introduction to .NET memory management

    • 3. What is stack memory?

    • 4. What is heap memory?

    • 5. How are value types stored in memory?

    • 6. How are reference types stored in memory?

    • 7. What is boxing and unboxing?

    • 8. How does garbage collection work in .NET?

    • 9. Tip #1: optimise your code for garbage collection

    • 10. Tip #2: avoid class finalizers

    • 11. Tip #3: use the dispose pattern

    • 12. Tip #4: avoid boxing and unboxing

    • 13. Tip #5: do not add many strings together

    • 14. Tip #6: use structs instead of classes

    • 15. Tip #7: always pre-allocate collections

    • 16. Tip #8: do not materialize LINQ expressions early

    • 17. Course recap






About This Class


Modern computers have loads of memory. But it's very easy to burn through it all in seconds if your code is not efficient about allocating and using memory.

Did you know that one simple mistake can make your code allocate 1600 times more memory than absolutely necessary?

Don't be 'that developer' who keeps crashing the development server with an OutOfMemory exception!

And you certainly don't want to be responsible for inflating the hardware budget. Can you imagine having to explain to your team that 256 GB of memory is not enough to run your code on the production server?

Let me help you.

It doesn't have to be like this. If you have a good understanding of the garbage collection process and follow a few simple best practices, you can dramatically reduce the memory footprint of your code.

Sound good?

In the last 10 years I have learned the secrets of garbage collection in .NET, and in this course I am going to share them all with you.

In a series of short lectures I will take a detailed look at the garbage collection process. I will show you all of the memory allocation problems you can expect when writing C# code, like unexpected boxing, string duplication, collection resizing, and more. I'll teach you quick and easy strategies to resolve these problems.

By the end of this course you will be able to master the garbage collector.

Why should you take this course?

You should take this course if you are a beginner or intermediate C# developer and want to take your skills to the next level. Garbage collection and memory management might sound complicated, but all of my lectures are very easy to follow and I explain all topics with clear code and many instructive diagrams. You'll have no trouble following along.

Or maybe you're working on a critical section of code in a C# project, and need to make sure your memory usage is as efficient as possible? The tips and tricks in this course will help you immensely.

Or maybe you're preparing for a C# related job interview? This course will give you an excellent foundation to answer any questions they might throw at you.

Meet Your Teacher

Mark Farragher

Microsoft Certified Trainer


Mark Farragher is a blogger, investor, serial entrepreneur, and the author of 11 successful Udemy courses. He has been a Founder and CTO, and has launched two startups in the Netherlands. Mark became a Microsoft Certified Trainer in 2005. Today he uses his extensive knowledge to help tech professionals with their leadership, communication, and technical skills.




1. Optimize memory use in C#: Let me ask you a question. Would you like to become a C# performance architect? Okay, I have to admit, I made that word up. But in my book, a C# performance architect is a senior developer who writes high-performance C# code: a senior developer who is acutely aware of code optimization techniques and writes code for games, for data analysis, for real-time data acquisition, for all these cool environments where fast code is essential. So would you like to become a C# performance architect? Would you like to join the club? Then this is the course for you.
In this course, I will teach you a ton of optimization hacks. We will look at the foundation of the .NET runtime, so I will give you a crash course in the stack and the heap: how value types and reference types are stored on the stack and the heap, and how data moves between the stack and the heap when we're running code. We'll be looking at memory optimization, so I will teach you exactly how the garbage collector works, what assumptions it makes about object size and object lifetime, and how you can optimize your code by playing into these assumptions. We're also going to look at some exotic topics, like finalizers and the dispose pattern. And we're going to look at several pieces of code that use up too much memory. They're way too greedy in memory allocation, so they trigger the garbage collector a lot, and that again slows down your code. So you will learn how to write code that makes efficient use of memory.
This is a fairly large course. It contains lots and lots of lectures, there are quizzes to test your knowledge, and you can download the source code that I've been using for benchmarking. So would you like to become a C# performance architect? Then this is the course for you.
So I've created this course for medium-level or senior C# developers who want to learn how to write fast and efficient C# code, to get their career ready for cool subjects like game development or real-time data acquisition. Thank you for listening. I look forward to welcoming you in the course.

2. Introduction to .NET memory management: Let's take a look at a typical C# program. The program will consist of classes containing fields and methods. A typical method will start with some variables being declared and initialized. Then there's code to perform some kind of action, and finally the method ends. The variables in the method get allocated on the stack, and every time you see the new keyword, an object gets allocated on the heap.
But here's a question for you: what is the opposite of new? Which keyword deletes an object? Some languages actually have a special keyword called delete, which is the opposite of new and explicitly deletes an object from the heap, but the .NET Framework does not have it. There's a new keyword, but there is no complementary delete keyword. How is that possible? The answer is that there is a special process called the garbage collector that is continuously deleting objects that are no longer used by your code.
You might be wondering why Microsoft chose to implement such a complicated solution. Why not simply add a delete keyword to the C# language and avoid the need for a garbage collector altogether? The reason they added the garbage collector is very simple. In languages that rely on the delete keyword, the most common programming mistakes are due to the incorrect use of new and delete. Allocating an object with new but forgetting to delete it: this is called a memory leak. Using an object after it has already been deleted: this will cause memory corruption. And deleting an object that has not been allocated.
This causes a memory error. By adding a garbage collector to the .NET Framework, Microsoft has largely eliminated these programming mistakes. It is very hard (not impossible, mind you) to create a memory leak in C# or to corrupt program memory.
But a garbage collector also has disadvantages. Because memory management is now fully automatic, developers tend to lose sight of the memory footprint of their code, and this can lead to code that works the way it's supposed to but allocates thousands of times more memory than it actually needs. I am going to show you how to measure the memory allocations of your code. I will teach you exactly how the garbage collector operates, and I'll share several simple tricks with you that will help you dramatically reduce the memory footprint of your own code.

3. What is stack memory?: Stack memory, or simply the stack, is a block of memory that is used for calling methods and storing local variables. So let me draw that here on the blackboard. I'll draw the stack as a vertical column like this. Now, when I start running some code, the stack will initially be empty. But when my code calls a method, this happens: the method parameters, the return address, and all local variables of the method are put on the stack. This whole block of data is called a stack frame.
So what happens to the stack when my code calls another method from inside this method? Well, the same thing happens again. The method parameters, return address, and local variables of the new method are put on the stack, right on top of the previous stack frame. So this is why it's called a stack: information is stacked on top of one another.
What happens when my code encounters a return statement? As you probably know, a return statement ends a method and returns to the calling code. Now, on the stack, this is what happens.
The entire stack frame that corresponded to the method is removed. But you might be thinking: what happened to all the local variables that were stored in the stack frame? Well, they all go out of scope, which is just a fancy way of saying that they are destroyed. So this is an important fact to remember: the moment you return out of a method, all the local variables of that method go out of scope and are destroyed. If I continue my program and also return out of the first method, we are back to where we started, with an empty stack.
You might be wondering what happens if a method calls another method, which calls another method, which calls another method, a thousand times. Well, the stack would quickly fill up with stack frames until there is no more room. Eventually the stack will be completely full and the .NET Framework throws a StackOverflowException. If you see this error message, it means you have an infinite sequence of method calls somewhere in your code.
Let's take a look at some code. I wrote a simple program to draw a square on the screen. You see that I have a DrawSquare method that calls a DrawLine method four times to draw the four sides of a square. I will put a breakpoint inside the DrawLine method and then run my code. Watch this. Now, at this point in my code, the stack will look like this. My first call to DrawSquare is here at the bottom of the stack, with its four parameters, return address, and local variables. Next is the call into DrawLine, with again four parameters, a return address, and in this case no local variables, because DrawLine doesn't have any. In Visual Studio, you can take a look at the stack by opening the Call Stack view, which is here. You can see the call into DrawSquare and then into DrawLine. So this window shows you which stack frames are stored on the stack right now.
As a final demonstration, let me show you a stack overflow exception. Let me modify my code like this.
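The code shown on screen is not part of this transcript. The following is only a minimal sketch of what the square-drawing program and its recursive modification might look like. The names DrawSquare, DrawLine, DrawLineForever, and LinesDrawn are assumptions, and the maxDepth guard is an addition so that the sketch terminates instead of crashing the process.

```csharp
using System;

public static class StackDemo
{
    // Counts DrawLine calls; only here so the demo is observable (an assumption).
    public static int LinesDrawn;

    // Hypothetical stand-in for the lecture's line-drawing method:
    // four integer parameters and no local variables.
    public static void DrawLine(int x1, int y1, int x2, int y2)
    {
        LinesDrawn++;
        Console.WriteLine($"Line from ({x1},{y1}) to ({x2},{y2})");
    }

    // Calls DrawLine four times, once per side of the square.
    // Each call pushes a new stack frame that is removed again on return.
    public static void DrawSquare(int x, int y, int width, int height)
    {
        DrawLine(x, y, x + width, y);
        DrawLine(x + width, y, x + width, y + height);
        DrawLine(x + width, y + height, x, y + height);
        DrawLine(x, y + height, x, y);
    }

    // The "modified" version: the method calls itself, so every call adds
    // another stack frame. Without the maxDepth guard (added for this sketch)
    // the recursion would never end and the stack would eventually overflow.
    public static int DrawLineForever(int depth, int maxDepth)
    {
        if (depth >= maxDepth)
            return depth;   // guard: stop long before the stack is full
        return DrawLineForever(depth + 1, maxDepth);
    }

    public static void Main()
    {
        DrawSquare(0, 0, 10, 10);
        Console.WriteLine($"Recursion stopped at depth {DrawLineForever(0, 1000)}");
    }
}
```

Removing the guard reproduces the situation described in the lecture, but the runtime then terminates the whole process, so the guarded form is easier to experiment with.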
I've modified my code so that DrawLine now calls into DrawLine, which then calls into DrawLine, which calls into DrawLine. You get the picture: this sequence will never end. When I run this program, it will create an infinite sequence of DrawLine methods calling themselves. Let's run the program and see what happens. And there you have it. The stack is limited in size. If I have too many methods calling other methods, eventually the stack will be completely full and the .NET Framework throws the StackOverflowException.
So what have we learned? The .NET Framework uses the stack to track method calls. Every time you call a method, all of its method parameters, the return address, and all local variables are placed on the stack. This entire block of memory is called a stack frame. When you return out of a method, the top stack frame is removed. All local variables go out of scope at this point and are destroyed. If you have an infinite sequence of methods calling other methods, the stack will fill up completely until .NET throws a StackOverflowException.

4. What is heap memory?: The other type of memory inside a computer is called heap memory, or simply the heap. Let's take a look at the following line of code. Any time the keyword new appears in a line, you are creating an object on the heap. This is a fundamental rule: in .NET, objects are always created on the heap. They never go on the stack.
So I've made some changes to my DrawSquare program. Let's take a look. Previously, the DrawLine method expected four integer parameters to draw a line. But now I have a DrawPolygon method, which expects an array of Line objects and draws everything in one go. The DrawSquare method sets up a Line array with four objects, corresponding to the four sides of the square, and then calls DrawPolygon to draw everything in one go.
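The modified program itself is not in the transcript, so here is a rough sketch of how it could be written. The Line class fields (X1, Y1, X2, Y2) and the return values are assumptions for illustration; only the overall shape (DrawSquare building a Line array and passing it to DrawPolygon) comes from the lecture.

```csharp
using System;

// Simple data container with two sets of coordinates (field names assumed).
public class Line
{
    public int X1, Y1, X2, Y2;

    public Line(int x1, int y1, int x2, int y2)
    {
        X1 = x1; Y1 = y1; X2 = x2; Y2 = y2;
    }
}

public static class HeapDemo
{
    // 'lines' is a parameter, so the reference itself sits in the stack frame.
    // The array and the Line objects it refers to were created with new.
    // Returns the number of lines drawn so the result can be checked.
    public static int DrawPolygon(Line[] lines)
    {
        int drawn = 0;
        foreach (Line line in lines)
        {
            Console.WriteLine($"Line from ({line.X1},{line.Y1}) to ({line.X2},{line.Y2})");
            drawn++;
        }
        return drawn;
    }

    public static int DrawSquare(int x, int y, int width, int height)
    {
        // Five uses of new: one array plus four Line objects, all on the heap.
        Line[] lines = new Line[]
        {
            new Line(x, y, x + width, y),
            new Line(x + width, y, x + width, y + height),
            new Line(x + width, y + height, x, y + height),
            new Line(x, y + height, x, y)
        };
        return DrawPolygon(lines);
    }

    public static void Main()
    {
        DrawSquare(0, 0, 10, 10);
    }
}
```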
Now remember, in my old DrawSquare program, I put a breakpoint inside the DrawLine method, and when I ran my code, the stack looked like this. But now I've made a lot of changes to the program. So what are the stack and the heap going to look like now? Okay, so imagine I put a breakpoint inside the DrawPolygon method and run my program. The stack and the heap would then look like this. The parameter called lines exists on the stack because it's a parameter, but I initialized it with the new keyword, so the array itself is created on the heap. So the variable on the stack refers to an object on the heap. You can also see that the lines array has four elements and that each element refers to a Line object elsewhere on the heap.
So now what happens when the DrawPolygon method finishes? That stack frame is removed, and the lines parameter goes out of scope and is destroyed. But here is something you might not have expected: the array and the Line objects on the heap continue to exist. Now, this is an interesting situation. The lines parameter is out of scope, but the objects on the heap are still there. We say that the objects are dereferenced, because the variable or parameter that referred to them went out of scope. Dereferenced objects continue to exist and are not destroyed immediately.
So here is another important takeaway: the .NET Framework will always postpone cleaning up dereferenced objects. This is because cleaning up the heap takes a lot of time, and by postponing it for as long as possible, your code will actually run faster. But eventually the framework will have to clean the heap, or we would quickly run out of memory. This cleaning process is called garbage collection, and it happens periodically in the background. When the framework starts garbage collecting, it identifies all objects on the heap that are no longer referenced by any variable, parameter, or object in your code, and it deallocates each of them.
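One way to watch a dereferenced object linger and then disappear is a weak reference, which can see an object without keeping it alive. The sketch below is not from the lecture. The helper name AllocateAndForget is hypothetical, and GC.Collect is called only to force a collection for demonstration purposes; normally the framework decides when to collect, and the exact timing is not guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CollectionDemo
{
    // Allocates an object inside its own stack frame and hands back only a
    // weak reference. Once this method returns, no variable refers to the
    // object any more. NoInlining keeps the allocation in a separate frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateAndForget()
    {
        object data = new byte[1024];
        return new WeakReference(data);
    }

    public static void Main()
    {
        WeakReference weak = AllocateAndForget();

        // The object is dereferenced but has usually not been cleaned up yet.
        Console.WriteLine($"Before collection, still on the heap: {weak.IsAlive}");

        // Force a collection (demonstration only).
        GC.Collect();
        GC.WaitForPendingFinalizers();

        // After a collection the dereferenced object is normally gone.
        Console.WriteLine($"After collection, still on the heap: {weak.IsAlive}");
    }
}
```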
So what have we learned? Every time you use the new keyword in your code, you are creating an object on the heap. The variable itself may live on the stack, but it will refer to an object on the heap. When parameters and local variables on the stack go out of scope, the corresponding objects on the heap are not destroyed. They continue to exist in a dereferenced state. The .NET Framework postpones cleaning up dereferenced objects on the heap for as long as possible for performance reasons. But eventually the framework will start a process called garbage collection, which deallocates all the dereferenced objects on the heap.

5. How are value types stored in memory?: In the previous lecture, we learned about the stack and the heap. Previously, when I talked about the stack, I showed some code with the DrawLine method that used four integer parameters. Let's take a closer look at an integer in the .NET Framework. The integer type is part of a special class of types called value types. But what is a value type? A value type is a type of variable where the type and the value of the variable are stored together. So if I have a local integer variable with a value of 1234, the integer type and its value will be stored together like this.
So let's take a look at the stack again. Previously, when I talked about the stack, I mentioned all the types of data that are stored on the stack. They are method parameters, the return address of a method, and local variables. So if I have a local variable of type integer with the value 1234, it would be stored on the stack like this. You see that the type and the value are stored together on the stack. Now keep this in mind, because in the next lecture I'm going to talk about reference types, which are stored differently.
So you might be wondering which types in the .NET Framework are actually value types. Well, here's a complete list.
All numeric types are value types, including all of the integer, floating-point, and decimal types. Also, Booleans, enumerations, and structs are value types. Anything else in the .NET Framework is called a reference type, which I will discuss shortly.
When people try to explain the difference between a value type and a reference type, you often hear the following explanation: a value type is a type that exists on the stack. However, this is wrong. Value types can exist both on the stack and on the heap. Let me demonstrate. Previously, when I talked about the heap, I mentioned what kind of data is stored on the heap. Do you remember what it was? It was all object instances created with the new keyword in C#. So imagine that I create an object on the heap using the new keyword in my code, and this object has one integer field containing the value 1234. Now this integer will be stored like this. So you see, I now have a value type on the heap. Value types can exist both on the stack and on the heap. The defining property of value types is not where they are stored, but that the value is stored together with the type.
So let me finish by showing you two important additional features of value types. Let's say I have a method in my code with two variables, a and b. Both are integers. The variable a contains the value 1234 and the variable b is zero. Now what happens when I assign a to b? Check this out: the value of a is copied into b. This is an important feature of value types: they are assigned by value, which means their value is copied over.
So now what happens if I compare a and b? They are two different variables that just happen to contain the same value. So how will the .NET Framework interpret this situation? Well, like this: the framework considers these two variables to be equal. This is the second important feature of value types: they are compared by value.
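The two features of value types described in this lesson can be sketched in a few lines. The class and method names (Container, AssignThenCompare, ChangeCopy) are made up for this illustration; the variables a and b and the value 1234 follow the lecture.

```csharp
using System;

// A class instance is created with new, so its integer field is stored
// inside the object on the heap: a value type that does not live on the stack.
public class Container
{
    public int Value = 1234;
}

public static class ValueTypeDemo
{
    // Assigns a to b, then compares the two variables.
    public static bool AssignThenCompare(int a)
    {
        int b = 0;
        b = a;              // assignment copies the value of a into b
        return a == b;      // comparison looks at the values only
    }

    // Changes the copy and returns the original to show it is unaffected.
    public static int ChangeCopy(int a)
    {
        int b = a;          // b holds its own copy of the value
        b = b + 1;          // modifying b does not touch a
        return a;
    }

    public static void Main()
    {
        Console.WriteLine(AssignThenCompare(1234));   // prints True
        Console.WriteLine(ChangeCopy(1234));          // prints 1234
        Console.WriteLine(new Container().Value);     // prints 1234
    }
}
```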
Being compared by value means that two different variables holding the same value are considered equal.
So what have we learned? Value types store their value directly together with the type. Value types can exist on the stack and on the heap. Value types are assigned by value, meaning the value is copied over. Value types are compared by value: two variables with the same value are considered equal.

6. How are reference types stored in memory?: In the previous lecture, I talked about value types and briefly mentioned their counterpart, the reference type. But what is a reference type? Well, a reference type is a type of variable that refers to a value stored on the heap. Now, previously, when I talked about the heap, I showed my modified DrawSquare program that had a DrawPolygon method, if you remember, with a Line array parameter. So DrawPolygon expected an array of Line objects. Let's take a closer look at that Line object. Just to refresh your memory, here is my code again. You can see the definition of the Line class here. It's a simple data container with two sets of coordinates.
So imagine I have a method with a local variable of type Line. What would that look like in memory? Well, like this. You can see that the variable itself is on the stack, but it refers to a Line object on the heap. But can a reference type variable also exist on the heap? Yeah, sure. All I need to do is create an object on the heap using the new keyword and have that object have one field of type Line. The memory will then look like this. You see that I now have a reference type variable on the heap, and it refers to a Line object, which is also stored on the heap, elsewhere.
So to summarize: reference types can exist on the stack and on the heap, just like value types, but they will always refer to a value on the heap. Let me finish by showing you two important additional features of reference types.
Let's say I have a method in my code with two variables, a and b. Both variables are Lines. The variable a refers to a Line instance on the heap and the variable b is set to null. Now what happens when I assign a to b? Check this out: the reference of a is copied into b. This is an important feature of reference types: they are assigned by reference, which means that the reference is copied over. You end up with two variables referring to the same object instance on the heap.
So now what happens when I compare a and b? They are two different variables that refer to the same object on the heap. How will the .NET Framework interpret this situation? Well, like this: the framework considers these two variables to be equal. But wait, what about this scenario: two reference type variables pointing to two separate objects on the heap, but both objects contain identical data. How will the .NET Framework interpret this situation? The framework considers these two variables to be not equal. So this is another important feature of reference types: they are compared by reference, which means two different variables referring to the same object are considered equal, but two different variables referring to two separate but identical objects are considered not equal.
So what have we learned? Reference types can be set to the null value. Reference types store a reference to their value, and this value is always stored on the heap. Reference types can exist on the stack and on the heap, but their value is always stored on the heap. Reference types are assigned by reference, meaning the reference is copied over. Reference types are compared by reference: two variables referring to the same object are considered equal, and two variables referring to separate but identical objects are considered not equal.

7. What is boxing and unboxing?: In this lecture, I'm going to show you a mystery. Take a look at this code. This is a very simple program.
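The program on screen is not reproduced in the transcript. Judging from the description in this lesson (an integer a, an object b, and later an integer c assigned with a cast), it probably resembles the sketch below. The wrapper names RoundTrip and BoxesAreSeparateObjects are assumptions added so the behaviour can be checked.

```csharp
using System;

public static class BoxingDemo
{
    // Mirrors the three variables discussed in the lesson.
    public static int RoundTrip(int a)
    {
        object b = a;       // boxing: the value is copied into a new object on the heap
        int c = (int)b;     // unboxing: the value is copied back out of that object
        return c;
    }

    // Every boxing operation allocates a fresh object, so boxing the same
    // integer twice gives two separate objects on the heap.
    public static bool BoxesAreSeparateObjects(int a)
    {
        object first = a;
        object second = a;
        return !object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(1234));                 // prints 1234
        Console.WriteLine(BoxesAreSeparateObjects(1234));   // prints True
    }
}
```

In the compiled intermediate language, the assignment to b shows up as a box instruction, and the cast typically appears as the unbox.any form of the unbox instruction.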
I start with the variable a containing the value 1234. Then I declare a second variable, b, of type object, and I assign a to b. In C#, all types inherit from object, including integers, so you can put basically everything into an object type variable. But wait: integers are value types, and objects are reference types. So in memory, my variables are stored like this. Here is my integer variable a with its value of 1234, and here is my object variable b. But b is a reference type, and we have learned in the previous lecture that reference types always refer to a value on the heap. Here, this is not possible, because a is a local variable, so it exists on the stack, and it's a value type, so its value is also stored on the stack. There is simply no way for b to refer to a, because both variables reside in different types of memory. A reference type can never refer to stack memory, so this program can never work, right?
Well, this one does. I go back to Xamarin Studio and run the program. Here we go. Cool, it actually works. But how is that possible? Now, this is weird. Based on what we learned in previous lectures, this program should not work, and yet it does. To find out, let me decompile this program into intermediate language. By examining the intermediate language instructions, we might find a clue. And here it is. Look at this intermediate language instruction called box. Can you guess what it does? Here, I'll draw it on the blackboard. Here is the memory layout with the two variables a and b. Now, the box instruction does this: to make the program work, the .NET Framework actually copies the integer value from the stack, packs the value into an object, and places this object on the heap, so the variable b can then refer to it. This whole process is called boxing, and it happens behind the scenes.
Boxing occurs whenever you have a variable, parameter, field, or property of type object and you assign a value type to it. Boxing is nice because it kind of blurs the line between value types and reference types. But boxing can also be a pain, because it introduces extra overhead in your code.
Now you might be wondering if there is a corresponding process called unboxing. Yep, there is. Here is my program again. Let's take a look at the final line. I declare a variable c of type integer and assign the object value to it using a typecast. Another bit of magic, actually, because c exists on the stack and b, the object, refers to an object on the heap. In the intermediate language, the corresponding instruction is here. It's called unbox. Let me go back to the blackboard and draw the unboxing process. We start from the boxed situation, with the integer on the heap. Then this happens: unboxing unpacks the integer on the heap and copies the value back into the variable c on the stack. Unboxing happens behind the scenes every time you have an object value and you cast it to a value type.
So what have we learned? Boxing is the process of taking value types on the stack, packing them into objects, and placing these objects on the heap. Boxing happens whenever you assign a value type to a variable, parameter, field, or property of type object. Unboxing is the reverse process: objects on the heap are unpacked, and the value types inside are copied back to the stack. Unboxing happens whenever you have an object value and you cast it to a value type.

8. How does garbage collection work in .NET?: In the second lecture of this course, the one on heap memory in the fundamentals section, we saw what happens when reference type variables go out of scope. The variables, which exist on the stack, are destroyed, and the corresponding objects on the heap are dereferenced. Dereferenced objects continue to exist and are not destroyed immediately.
I briefly mentioned that a separate process called the garbage collector periodically cleans up these objects. So in this lecture, we're going to take a closer look at the garbage collector. What is the garbage collector, and how does it work?
Let's start with a very simple program. This program only has one method with one local variable, an object array. The array is initialized with five objects, and these five objects also reside on the heap, adjacent to the array itself. Now, to make things more interesting, let's remove array elements 2 and 3 by setting their array elements to null. The corresponding objects number 2 and 3 still exist on the heap, but now they are dereferenced. There are no references to these objects from anywhere in the code.
What happens when the garbage collector kicks in? The .NET garbage collector is a mark-and-sweep collector, which means there are two distinct stages of garbage collecting: a mark stage and a sweep stage. During the mark stage, the garbage collector marks all the live objects on the heap. So in this example, that would be the array itself and the objects 0, 1, and 4. The objects 2 and 3 are skipped, because there are no references to these objects from anywhere in the code. Next comes the sweep stage. All objects which have not been marked in the previous stage are dereferenced, and in this stage they are deallocated from the heap. So in this example, the objects 2 and 3 have not been marked, and so they are deallocated, and this leaves a kind of hole on the heap. The .NET garbage collector performs one additional step after the sweep, which is called compact. In the compact stage, all holes on the heap are removed, so in this example, the object 4 is moved up to fill the hole that object 2 left behind.
The mark-and-sweep garbage collector is very good at locating each and every dereferenced object on the heap and removing it, but it also has a big drawback: it is very slow.
During the mark stage, the garbage collector has to inspect each object in memory to determine if it is live or dereferenced. If there are many thousands of objects on the heap, your program will actually freeze for a while as the garbage collector is inspecting each and every object. The process is also very inefficient, because long-living objects on the heap are checked and rechecked during each cycle, because they could be dereferenced at any time. So a very long-living object might get checked hundreds of times if it is still alive.
The solution to this problem is called a generational garbage collector. The .NET garbage collector is generational, and it has three generations, which you can visualize as three separate heaps. All new allocations go into the first generational heap, called generation 0. So if we revisit the test program with the five-element array with elements 2 and 3 set to null, then the memory layout would look like this. Everything is the same as before, but now all objects are residing in generation 0. Generations 1 and 2 are empty. The first collection cycle does a mark and sweep, and all objects that survive the sweep move to generation 1. So after one cycle, the memory layout looks like this: the array and objects 0, 1, and 4 have survived the sweep and are now in generation 1.
Now imagine that the program continues at this point and puts a new object 5 in array element 2. All new allocations go into generation 0, so the memory layout would look like this. As you see, this is an interesting situation. The array resides in generation 1, but its elements are in generations 0 and 1. This is perfectly valid. Now the garbage collector kicks in again for a second cycle. All generation 1 objects move to generation 2, and the new object in generation 0 moves to generation 1. If the program continues and puts a new object 6 in array element 3, it would again go into generation 0.
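The movement of a surviving object through the generations can be observed with GC.GetGeneration. The sketch below is an illustration rather than the lecturer's benchmark code: TrackGenerations is a hypothetical helper, and the collections are forced with GC.Collect purely for demonstration. On a typical run the array reports generations 0, 1, and 2 in turn, but the runtime does not guarantee that exact sequence.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of an object before any collection and after
    // each of two forced collections.
    public static int[] TrackGenerations(object o)
    {
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(o);   // freshly allocated
        GC.Collect();                    // first cycle (forced for the demo)
        seen[1] = GC.GetGeneration(o);   // after surviving one cycle
        GC.Collect();                    // second cycle
        seen[2] = GC.GetGeneration(o);   // after surviving two cycles

        GC.KeepAlive(o);                 // make sure o counts as live throughout
        return seen;
    }

    public static void Main()
    {
        object[] array = new object[5];
        int[] generations = TrackGenerations(array);

        Console.WriteLine($"Generations observed: {generations[0]}, {generations[1]}, {generations[2]}");
        Console.WriteLine($"Highest generation number: {GC.MaxGeneration}");
    }
}
```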
We now have an array in generation 2 referring to objects in generations 0, 1, and 2. Again, this is perfectly valid.
So you might be wondering at this point: why is all this happening? Why have these three generations? Well, the answer is very simple. Generations help to limit the number of objects in generation 0. Every collection cycle completely clears generation 0 of all objects. So in the next cycle, the garbage collector only has to inspect new objects that were created after the last cycle. Of course, the problem isn't going away; the garbage collector simply moved the objects somewhere else. But here's the key: generations 1 and 2 are collected very infrequently. The garbage collector assumes that anything that reaches generation 2 must be a long-living object that does not need to be checked very often. So this solves two problems. First, it reduces the number of objects in generation 0, so the garbage collector has less work to do. Second, long-living objects that survive into generation 2 are not checked very often, which is exactly what we want.
The generational garbage collector is a beautiful, high-performance algorithm, but it has an important drawback: it is inefficient at processing large, long-living objects. Consider this: a large and long-living object will get allocated in generation 0. It survives the first cycle, and the heap gets compacted, which potentially moves the object in memory. Then it moves to generation 1, gets compacted again, and moves to generation 2. All in all, these are two compactions and two moves, so a total of four memory copy operations for a single object before it arrives in generation 2 and the garbage collector ignores it for a while. If the object is very large, these four copy operations per object can introduce a significant performance overhead. So the solution to this problem is to have two separate heaps.
for large objects. The design looks something like this: in .NET there are two heaps, the small object heap, which works with the three generations we discussed before, and the large object heap. The special thing about the large object heap is that it does not use generations. In fact, it has only a single generation, which is synchronized with generation 2 of the small object heap. So when the garbage collector processes generation 2 of the small object heap, it also runs through the entire large object heap. Another interesting fact about the large object heap is that it does not compact the heap during the sweep cycle. It simply merges free memory blocks together, but it does not do any compaction to optimize the total amount of free space.

You might be wondering what determines if an object is small or large. Well, the size threshold is at 85 kilobytes. Any object of 85 kilobytes or larger goes directly to the large object heap. Any object smaller than this limit goes into the small object heap. Having these two separate heaps solves the problem of large, long-living objects. They no longer need to be copied four times before they end up in generation 2. Instead, they go directly into the large object heap, which is only processed in generation 2 and never compacted.

And there you have it. The .NET garbage collector is a generational garbage collector that uses a mark, sweep and compact cycle, and it has separate heaps for large objects and small objects. If you think about it, the .NET garbage collector makes some very specific assumptions about objects and lifetimes. First, it assumes objects will either be short-lived or long-lived. All short-lived objects should be allocated, used and discarded in a single collection cycle. Any object that slips through the cracks, so to speak, is caught in generation 1 in the next cycle. So any object that survives two collection cycles ends up in generation 2 and must be a long-living object.
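The promotion through the generations and the special treatment of large objects can be observed with a few lines of code. This is a minimal sketch, not code from the lecture: the class name and the 100,000-byte array size are illustrative assumptions, and the exact numbers printed depend on the runtime and GC configuration, so no output is promised here.

```csharp
using System;

class GenerationDemo
{
    static void Main()
    {
        // A small object normally starts its life in generation 0.
        object small = new object();
        Console.WriteLine($"small object, new:            gen {GC.GetGeneration(small)}");

        // Each forced collection that the object survives can promote it by one generation.
        GC.Collect();
        Console.WriteLine($"small object, after 1 cycle:  gen {GC.GetGeneration(small)}");
        GC.Collect();
        Console.WriteLine($"small object, after 2 cycles: gen {GC.GetGeneration(small)}");

        // A 100,000-byte array is above the large object threshold,
        // so the runtime is expected to report it as generation 2 right away.
        byte[] large = new byte[100_000];
        Console.WriteLine($"large array, new:             gen {GC.GetGeneration(large)}");

        // Keep both references alive until the end of the method.
        GC.KeepAlive(small);
        GC.KeepAlive(large);
    }
}
```

Forcing collections with `GC.Collect` is only done here to make the promotion visible. In production code the garbage collector should normally be left to schedule its own cycles.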
Also, any object over 85 kilobytes in size is always considered to be a long-living object. And looking at the collection frequency of the various generations, it is clear that the garbage collector assumes that the overwhelming majority of objects will be short-lived. So I can sum up my memory optimization advice in a single sentence: do not go against these assumptions.

So what have we learned? The garbage collector uses a mark, sweep and compact cycle. The garbage collector has two separate heaps for large and small objects: the large object heap and the small object heap. The small object heap uses three generations. All new objects are allocated in generation 0 and progress towards generation 2. The large object heap has a single generation, which is processed together with generation 2 of the small object heap. Also, the large object heap does not compact the heap to optimize free memory. The garbage collector makes two important assumptions about object sizes and lifetimes. One: 90% of all objects smaller than 85 kilobytes must be short-lived. Two: all objects larger than 85 kilobytes must be long-lived.

9. Tip #1: optimise your code for garbage collection: In this lecture we're going to look at several optimizations to make the garbage collector run as fast as possible. But first, let's recap what we learned about the garbage collector in the previous lecture. The .NET garbage collector uses a mark, sweep and compact cycle to clean up dereferenced objects from the heap. It uses two separate heaps for large and small objects: the large object heap and the small object heap. The small object heap uses three generations. All new objects are allocated in generation 0 and progress towards generation 2. Generation 0 is collected very frequently, generations 1 and 2 much less so. Generations help to limit the number of objects in generation 0.
Every collection cycle completely clears generation 0 of all objects. In the next cycle, the garbage collector only has to inspect new objects that were created after the last cycle.

The first memory-based performance optimization that we're going to look at is to simply limit the number of objects that we create in generation 0. The fewer objects are created, the less work the garbage collector has to do to clean up the heap. There are two strategies that you can follow to limit the number of objects on the heap. The first is to make sure your code does not create any redundant objects anywhere. The second is to allocate, use and discard your objects as fast as possible, so that they are ready to be deallocated by the garbage collector in the next cycle. If you wait too long between allocating, using and discarding your objects, you run the risk of them ending up in generation 1 or 2. So for short-lived objects, you want your code to be as tight as possible.

Let's take a look at some examples. Here is a code fragment that loops 10,000 times and builds up a string, using a StringBuilder with a call to the Append method. Can you see the problem with this code? I'll give you 10 seconds to think. Here's the solution. The problem is with the string concatenation inside the Append call. You'll remember that strings are immutable, and so the ToString method and the addition both create extra string objects on the heap for every loop iteration. The code at the bottom avoids this problem by simply calling Append twice. The difference is 40,000 fewer string objects on the heap. That's a big improvement.

So here's another example. See if you can spot the problem. I'll give you 10 seconds again. And here is the solution. If you store integers in an ArrayList, the integers get boxed on the heap.
The generic List avoids this problem by using an internal integer array instead of an object array. It's a simple modification to the code that results in 20,000 fewer boxed integer objects on the heap.

Okay, one more example. A small static object gets initialized, then lots of other code runs first, and finally the object actually gets used. What's wrong with this picture? I'll give you 10 seconds. And here is the answer. The object is small, but the gap between allocation and use is very large, so there's a big chance the object ends up in generation 1 or 2 before it gets used. The code at the bottom avoids this problem by first making the object non-static, then allocating it just before use, and finally setting the object reference to null right after use to signal to the garbage collector that we're done and that the object is ready to be collected. If you don't like having null assignments all over your code, you can also wrap the bottom code in a method and have the object reference go out of scope when you exit the method. That's my favorite solution.

The next optimization you can perform is to fine-tune the lifetime of your objects. The garbage collector assumes that almost all small objects will be short-lived and all large objects will be long-lived, so we should avoid the opposite: small, long-lived objects or large, short-lived objects. It's instructive to view these combinations on a graph. If I plot the object lifetime horizontally and the object size vertically, I get the following chart. The bottom left and top right quadrants are where you want to be. These combinations of object sizes and lifetimes match exactly with the assumptions of the garbage collector. The top left and bottom right quadrants are at odds with the assumptions of the garbage collector.
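The first two examples from this lecture, the StringBuilder loop and the ArrayList of integers, can be sketched as follows. The slides are not part of the transcript, so this is a reconstruction under assumptions: the `"KB"` suffix, the variable names and the loop bodies are hypothetical, and only the general shape (one concatenating Append versus two Append calls, ArrayList versus `List<int>`) comes from the lecture.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;

class FewerAllocations
{
    static void Main()
    {
        // Wasteful: ToString() and the "+" each create a temporary string per iteration.
        var concatenated = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            concatenated.Append(i.ToString() + "KB"); // "KB" is an assumed suffix
        }

        // Leaner: two Append calls write straight into the builder's buffer.
        var appended = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            appended.Append(i);
            appended.Append("KB");
        }

        // ArrayList stores object references, so every int is boxed on the heap.
        var boxed = new ArrayList();
        for (int i = 0; i < 10000; i++)
        {
            boxed.Add(i);
        }

        // List<int> keeps the integers in an internal int[] without boxing.
        var unboxed = new List<int>();
        for (int i = 0; i < 10000; i++)
        {
            unboxed.Add(i);
        }

        // Both variants produce the same data; only the allocation pattern differs.
        Console.WriteLine(concatenated.ToString() == appended.ToString());
        Console.WriteLine(boxed.Count == unboxed.Count);
    }
}
```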
If your code contains lots of objects from these quadrants, you are effectively working against the assumptions of the garbage collector, and the performance of your code will probably suffer as a result. So what can we do to get into the correct quadrants?

Let's start with object lifetime. To refactor large, short-lived objects, we need to increase the object lifetime. There is a very simple strategy for this, which is called object pooling. The idea is that instead of discarding an object and allocating a new object, you reuse the existing object. Because the same object is being used over and over, it effectively becomes a long-living object. This is a very popular strategy for optimizing the large object heap.

So let's look at an example. Here is a fragment of code that allocates a large ArrayList and then uses it twice, discarding and reallocating the list between uses. How would you improve this code? I'll give you 10 seconds to think about it. And here is the solution. Instead of discarding and reallocating the list, you wipe it clean with a call to the Clear method and then reuse the list for the second method call. In the new code, the same ArrayList object gets used over and over. Its lifetime is increased, and the object effectively becomes long-living. This change improves performance and reduces the chance that the large object heap becomes fragmented.

Now let's look at the inverse problem. We have a small, long-lived object that we must refactor into a short-lived object. How would that work? Here's an example. This code fills an ArrayList with 10,000 pair objects, and each pair contains two integers. So what's wrong with this code? I'll give you 10 seconds to think about it. The problem is that the ArrayList is a large object, so it goes onto the large object heap and is assumed to be long-living. But the list is filled with tiny pair objects, 10,000 of them.
All these objects go onto the small object heap, into generation 0. Because the ArrayList is keeping a reference to each item, all these pairs will never be dereferenced, and they will eventually all move into generation 2. The solution is to use another popular refactoring strategy. Instead of having one list with two integers in each list element, we break the list apart into two separate integer arrays. Because an integer is a value type, it will be stored with the array. So now we only have two large arrays in the large object heap and absolutely nothing in generation 0. Problem solved.

The third optimization you can perform is to fine-tune the size of your objects. The garbage collector assumes that almost all small objects will be short-lived and all large objects will be long-lived. So if we have the opposite in our code, small long-lived objects or large short-lived objects, we need to refactor the size of these objects to get them back into the correct chart quadrants.

Let's start with a large, short-lived object. To reduce the size of this object, there are two strategies. One: split the object apart into sub-objects, each smaller than 85 kilobytes. Or two: reduce the memory footprint of the object. Here's an example of the second strategy. A loop fills a buffer with 32,000 bytes. Can you see what's wrong with this code? I'll give you 10 seconds. And here's the answer. The loop fills the buffer with bytes, but the buffer is defined as an array of integers. The buffer holds 32,000 items, and since an integer is four bytes in size, this adds up to 128,000 bytes. This is above the large object threshold, and therefore this buffer goes directly onto the large object heap and gets collected in generation 2. The solution is to refactor the buffer as a byte buffer. Now the memory footprint of the buffer is exactly 32,000 bytes, which is smaller than the large object threshold.
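The two refactorings just described, splitting a list of pairs into parallel integer arrays and shrinking an integer buffer to a byte buffer, can be sketched like this. All names are hypothetical and the original slide code is not available. The generation values are printed rather than asserted, because whether an array lands on the large object heap depends on its actual size and on the runtime.

```csharp
using System;

class SizeAndLayout
{
    static void Main()
    {
        const int pairCount = 10000;

        // Instead of 10,000 small pair objects referenced from one list,
        // keep two parallel arrays. Integers are value types, so they are
        // stored inside the arrays and no per-element objects are created.
        int[] first = new int[pairCount];
        int[] second = new int[pairCount];
        for (int i = 0; i < pairCount; i++)
        {
            first[i] = i;
            second[i] = i * 2;
        }
        Console.WriteLine($"pair 42 = ({first[42]}, {second[42]})");

        // Buffer arithmetic from the lecture: 32,000 items of 4 bytes versus 1 byte.
        const int items = 32000;
        Console.WriteLine($"int buffer payload:  {items * sizeof(int)} bytes");
        Console.WriteLine($"byte buffer payload: {items * sizeof(byte)} bytes");

        // The int buffer is above the large object threshold, the byte buffer is below it.
        int[] intBuffer = new int[items];
        byte[] byteBuffer = new byte[items];
        Console.WriteLine($"int buffer generation:  {GC.GetGeneration(intBuffer)}");
        Console.WriteLine($"byte buffer generation: {GC.GetGeneration(byteBuffer)}");
    }
}
```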
And so it gets stored on the small object heap in generation 0, just like we want.

Now, let's look at the inverse problem. We have a small, long-lived object that we must refactor into a large, long-lived object. How would that work? The solution is to either enlarge the memory footprint of the object or to merge several objects together to create a bigger object that can go on the large object heap. So here is the final example of this lecture. This code declares a static ArrayList and then, somewhere halfway in the program, starts to use it. What's wrong with this code? I'll give you 10 seconds. Here's the answer. It's clear that the object is intended to be a long-living object because it is declared as static. If we also know that the list will eventually contain at least 85 kilobytes of data, then it is better to initialize the list to this size. This ensures that the list goes directly onto the large object heap. If you do not initialize the list, it gets the default capacity, which off the top of my head is 16 kilobytes. So then the list goes onto the small object heap in generation 0 and eventually moves to generation 2 after potentially having undergone four memory copy operations. By initializing the list to the correct size right away, you avoid the migration from generation 0 to generation 2 entirely. It might seem strange that you can optimize code by making objects bigger, but that's exactly what we're doing here. And sometimes it really works.

So what have we learned? To optimize your code in such a way that the garbage collector runs as fast as possible, you need to follow these strategies. First, limit the number of objects that you create. Second, allocate, use and discard small objects as fast as possible. Third, reuse all large objects. You want to work with the garbage collector and not against it. And so you must ensure that all objects in your code are either small and short-lived, or large and long-lived.
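Two of the techniques from this lecture, reusing a large list through Clear and pre-sizing a long-lived static list, can be combined in one small sketch. The names `Samples` and `Fill` and the capacity of 50,000 items are assumptions made for illustration, and the lecture's slides used ArrayList where this sketch uses the generic `List<int>`.

```csharp
using System;
using System.Collections.Generic;

class ReuseAndPresize
{
    const int ItemCount = 50000;

    // Long-lived list, sized up front: 50,000 ints need roughly 200,000 bytes,
    // so the internal array is allocated once at its final size instead of
    // growing through several smaller arrays.
    static readonly List<int> Samples = new List<int>(ItemCount);

    static void Fill(List<int> list, int seed)
    {
        for (int i = 0; i < ItemCount; i++)
        {
            list.Add(seed + i);
        }
    }

    static void Main()
    {
        Fill(Samples, 0);
        Console.WriteLine($"first use:  {Samples.Count} items");

        // Reuse instead of reallocating: Clear empties the list but keeps its capacity.
        Samples.Clear();
        Fill(Samples, 100);
        Console.WriteLine($"second use: {Samples.Count} items, capacity {Samples.Capacity}");
    }
}
```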
So if you happen to have objects that are either large and short-lived or small and long-lived, you might want to refactor them. For large, short-lived objects, you can either increase the lifetime or decrease the size of the object. And for small, long-lived objects, you can either decrease the lifetime or increase the size. All these changes will benefit the performance of your code.

10. Tip #2: avoid class finalizers: In this lecture, I'd like to talk about finalizers. A finalizer is a class method that runs automatically just before an object gets cleaned up by the garbage collector. Finalizers are also sometimes called destructors, to signify that they are the opposite of constructors. Whereas a constructor initializes an object and prepares it for first use, a destructor does the opposite and cleans up any lingering resources so the object can be discarded.

Finalizers are very useful if you allocate scarce resources in your object, for example an operating system file handle. There are only a limited number of file handles available, so it's very important to release the handle when you're done with the object. Sure, you could implement a Close method. But what if someone forgets to call it? It's safer to release the file handle in a finalizer, so that it is guaranteed to be released when the object is cleaned up. In code, that would look something like this. You can see the class declaration at the top and the constructor that initializes the object. And below the constructor is a weird-looking method that starts with a tilde character. This is the finalizer, and it will be called by the garbage collector just before the object is cleaned up.

In previous lectures, I showed you how the garbage collector works. You might remember this slide with the three generations and the array of objects moving through them. Let's take another look at the collection process, and this time let's assume that each array element has a finalizer.
So if we revisit that scenario with elements 2 and 3 set to null, then the memory layout would look like this. Everything is the same as before, with all objects residing in generation 0. Generations 1 and 2 are empty. The first collection cycle does a mark and sweep, and all objects that survive the sweep move to generation 1. So after the first cycle, the memory layout looks like this. The array and objects 0, 1 and 4 have survived the sweep and are now in generation 1. But here's the change: objects 2 and 3 are not cleaned up. They can't be, because they have finalizers which still need to be run. So the garbage collector also moves objects 2 and 3 to generation 1 and adds them to a special finalization queue. After the collection cycle completes, a separate thread called the finalization thread wakes up and goes to work, executing the finalizers of all objects in the finalization queue.

Let's take a look at some actual code. I made a small program that allocates objects with finalizers so that we can see when and how they are called. Take a look at this code. Here is the main program method that repeatedly allocates objects in a loop. Each object gets a unique index number so we can track them individually. When I press a key, the loop ends, the application writes a text on the console that it is terminating, and the main program thread ends. And here is the class declaration of the MyObject class. It's a very simple class with only a single integer field and a constructor for initializing the integer. The constructor has an extra line here that writes to the console in which garbage collection generation the object resides. And here is the class destructor or finalizer. It sleeps for 500 milliseconds and then writes the current generation and index number to the console. Okay, let's run the program. Here we go.
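The program described above might look roughly like the sketch below. It is a reconstruction from the spoken description, not the instructor's actual source: the message texts and the use of `Console.KeyAvailable` to detect the key press are assumptions. Note also that the behaviour at process exit is runtime-specific. The lecture demonstrates finalizers running during shutdown, whereas on .NET Core and later versions finalizers are not run when the process exits.

```csharp
using System;
using System.Threading;

class MyObject
{
    private readonly int index;

    public MyObject(int index)
    {
        this.index = index;
        // Report in which generation the freshly constructed object resides.
        Console.WriteLine($"Constructing object {index} in gen {GC.GetGeneration(this)}");
    }

    // Finalizer: called by the finalization thread some time after
    // the object has become unreachable.
    ~MyObject()
    {
        Thread.Sleep(500);
        Console.WriteLine($"Finalizing object {index} in gen {GC.GetGeneration(this)}");
    }
}

class Program
{
    static void Main()
    {
        int index = 0;

        // Allocate objects until a key is pressed.
        // (KeyAvailable needs an interactive console; it throws if input is redirected.)
        while (!Console.KeyAvailable)
        {
            new MyObject(index++);
        }

        Console.WriteLine("Terminating process...");
    }
}
```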
Everything seems to be working: lots of objects being constructed in generation 0, just as we would expect. Now I'm going to press a key. You can see the finalizers being called once every 500 milliseconds. The main program thread has ended, but the finalization queue is still full of objects, so the finalization thread keeps going and finalizes objects one by one until the process finally ends. But look at this. First, the index numbers are in random order, so we know that the garbage collector is processing the finalizers in random order. Second, all objects are finalized in generation 1. And third, after I press a key, the application finalizes eight objects and then terminates. What happened to the millions of other objects?

So by now you should have a hunch that finalizers are complicated things that are easy to get wrong. Let's run through the observations one by one. I'll start with the random order of the index numbers. We've seen that the garbage collector processes objects in a completely random order. This means that your finalizer code cannot make any assumptions about other objects still being alive. Any reference your object might hold could already be finalized. This leads to a very important rule when writing finalizers: a finalizer must only process its own object and never other referenced objects.

Here's a good example. Take a look at this code. If the StreamWriter is finalized first, everything is fine. The data is flushed to disk, and then the file closes. But if the objects are finalized the other way round, then the StreamWriter will attempt to write data into a closed file and the application will crash. So this code would actually crash 50% of the time, depending on the ordering of the finalizers. But in reality, this does not happen. Microsoft saw this problem coming and decided not to implement a finalizer in the StreamWriter class. So if you forget to close the StreamWriter, your code will not crash.
But the stream will also not be written to disk. You will lose data. So in summary, you cannot make any assumptions about the order in which your objects will be finalized, and you should never try to use a referenced object in your finalizer because it might already be finalized.

All right. The second thing we observed from the code is that all objects are finalized in generation 1. This makes perfect sense when you think about it. The garbage collector marks all live objects in generation 0 and moves them to generation 1. All dereferenced objects with finalizers are also moved to generation 1 for safekeeping and added to the finalization queue. The finalization thread processes the objects one by one, calling each finalizer and removing the object from the queue. After each finalizer has run, the objects are finally cleaned up in the next collection cycle.

We've seen in a previous lecture that the garbage collector is optimized for small, short-lived objects, and by short-lived I mean an object whose entire lifetime fits into generation 0. So here is a big disadvantage of objects with finalizers: their lifetime will always span at least generations 0 and 1. It takes at least two collection cycles for the object to finally be cleaned up, with the finalizer executing between collections. And if there are many objects in the queue, it might even take up to generation 2 before the object finally gets finalized. If you have many small, short-lived objects with finalizers, you actually go against the garbage collector, because you are extending their lifetime into generation 1 and thereby doubling the number of memory copy operations needed to mark, sweep and compact the heap. Another disadvantage of finalizers is that any object you reference will also end up in generation 1.
So if you implement a collection class with a finalizer, this means that the collection plus all items in the collection end up in generation 1 before being collected. This leads me to an important optimization rule: you should never add finalizers to objects at the root of an object graph, only to objects at the edge.

Finally, let's look at the last code observation. We saw that after I pressed a key, only eight objects got finalized before the application ended. What happened there? The answer is actually very simple. When a process is exiting, the .NET runtime gives the finalization thread a maximum of four seconds to process any remaining objects in the finalization queue. After this deadline, the entire process terminates and all remaining objects in the queue are deallocated without calling their finalizers. And this leads me to the final point about finalizers: they are not guaranteed to be called. If the host process is exiting, anything remaining in the finalization queue after four seconds is discarded.

Okay, so let's summarize the disadvantages of finalizers that we've seen so far. Finalizers are called in a random order. Objects with finalizers always end up in generation 1 and sometimes even in generation 2. When a process ends, some finalizers might not get called.

And here is a summary of the topics we've covered in this lecture. Finalizers are class methods that are called when objects are about to be cleaned up by the garbage collector. Many small, short-lived objects with finalizers are bad, because the finalizer extends their lifetime into generation 1. You should only add finalizers to objects at the edge of an object graph. Finalizers should never reference other objects. Finalizers should be extremely short and run very quickly.

11.
Tip #3: use the dispose pattern: The garbage collector in .NET is very helpful in that it constantly scans the heap for dereferenced objects and cleans them up in discrete collection cycles. As programmers we can simply allocate objects when we need them and forget about them when we're done. The garbage collector will follow us around, so to speak, and clean up our mess.

When an object needs to allocate some kind of scarce resource, let's say a native file handle or a graphics device context or a block of native memory, then the perfect place to do so is in the constructor. The moment the object is ready for use, the resource will also be ready and available. Conversely, the best place to release the resource is in the finalizer. The finalizer method gets called automatically by the garbage collector just before the object gets deallocated. Doing so guarantees that the resource gets released at the exact moment when the garbage collector is about to deallocate the object.

However, this approach has a big problem. The time between construction and finalization can be huge. The garbage collector is slow and tries to run as little as possible. In fact, if enough system memory is available, the garbage collector might not run at all, preferring to let the memory slowly fill up over periods of days or weeks. Keeping a scarce resource allocated for days is a terrible practice that kills all scalability. And what makes it worse is that the object is probably no longer being used and has already been dereferenced.

So how do we fix this problem? The solution is to use the dispose pattern. This design pattern introduces a Dispose method for explicitly releasing resources. Let's take a look at some code. Here is a little program I wrote to demonstrate the advantages of the dispose pattern. Let's start with the main program method here.
I have two static field declarations at the top: finalized objects, which will hold the total number of finalized objects, and total time, which will hold the total lifetime of all objects combined in milliseconds. Then here in this loop, I allocate 500,000 objects and call their DoWork method to simulate some kind of work being done. Then, when the loop ends, I show the total number of finalized objects and the average number of milliseconds that an object was alive.

Let's take a look at the WithoutDispose class. It's a very simple class that allocates and starts a stopwatch in its constructor. The DoWork method is a simple loop that does some kind of calculation to simulate work being done, and the finalizer down here stops the stopwatch and updates the two static fields in the main program. Note that I use the Interlocked class to update the static fields in a thread-safe manner, because there might be more than one thread trying to update the fields at the same time. The Interlocked class will make sure that all threads run in sequence and no race condition can occur.

Okay, let me run the program so you can see the results. Here we go. And here are the results. The program managed to finalize 455,000 objects, which means that the remaining 45,000 objects are still on the heap in generation 0, waiting to be collected. The average lifetime of an object is 736 milliseconds. So this is the average amount of time between construction and finalization.

So here's our problem. We are pretending that our objects allocate some kind of scarce resource. But those resources are being held for 736 milliseconds before being released, and at any given time there are 45,000 uncollected, dereferenced objects on the heap waiting to be collected but still holding on to their resource. This is going to be terrible for scalability. We can fix this problem by using the dispose pattern.
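A minimal sketch of the measurement program described above could look like this. It follows the spoken description, but the field names, the body of `DoWork` and the output format are assumptions, since the original source is not part of the transcript. The finalizer reads the stopwatch, which is only acceptable here because `Stopwatch` is a plain managed object without a finalizer of its own.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class WithoutDispose
{
    private readonly Stopwatch stopwatch;

    public WithoutDispose()
    {
        // Start timing the lifetime of this object (a stand-in for a scarce resource).
        stopwatch = Stopwatch.StartNew();
    }

    public double DoWork()
    {
        // Simulate some work with a small calculation.
        double result = 0;
        for (int i = 0; i < 100; i++)
        {
            result += Math.Sqrt(i);
        }
        return result;
    }

    ~WithoutDispose()
    {
        stopwatch.Stop();
        // The finalizer runs on another thread, so update the counters atomically.
        Interlocked.Increment(ref Program.FinalizedObjects);
        Interlocked.Add(ref Program.TotalTime, stopwatch.ElapsedMilliseconds);
    }
}

class Program
{
    public static long FinalizedObjects;
    public static long TotalTime;

    static void Main()
    {
        for (int i = 0; i < 500000; i++)
        {
            var obj = new WithoutDispose();
            obj.DoWork();
        }

        long finalized = Interlocked.Read(ref FinalizedObjects);
        long total = Interlocked.Read(ref TotalTime);
        double average = finalized > 0 ? (double)total / finalized : 0;

        Console.WriteLine($"Finalized objects: {finalized}");
        Console.WriteLine($"Average lifetime:  {average:F2} ms");
    }
}
```

The numbers printed will vary from run to run and between runtimes, because they depend entirely on when the garbage collector decides to run.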
Take a look at this class here called WithDispose, in which I have implemented the pattern. You can see that the constructor and the DoWork method are exactly the same. But look at this interface here. The dispose pattern requires that you implement the IDisposable interface. The interface defines a single Dispose method in which you are supposed to explicitly release any allocated scarce resources. I have implemented my Dispose method here. My method simply calls another, protected virtual Dispose method. Down here is the finalizer, which also calls the protected virtual Dispose method.

You can see that the finalizer provides a false argument and the public Dispose method provides a true argument. So inside the protected Dispose method I can now distinguish whether the code is being called from the finalizer or from the public Dispose method. This is very useful, because you might remember from the lecture on finalizers that there are many things you are not allowed to do inside a finalizer, like, for example, accessing references to other objects. So if this Boolean argument is false, I know that the code is running on the finalizer thread and I cannot access other objects. But if the argument is true, then this method was called from user code and everything is fine. I can do whatever I want.

There is also a disposed Boolean here that ensures that the dispose code is only run once. So if for whatever reason someone keeps calling the public Dispose method over and over, this code will only execute once.

Finally, I would like to show you one more cool trick. Imagine for a second that the Dispose method has been called and the object has released its allocated resources. So now there is no need for the finalizer to run at all, because there is nothing left to do. Sure, the finalizer wouldn't actually do anything because the disposed variable is already set to true.
But remember, a finalizer will extend the lifetime of any object into generation 1. This will put pressure on the garbage collector for no reason. Fortunately, there is a very cool solution. You can actually switch off the finalizer for any object. This line of code here will disable the finalizer if the Dispose method has correctly run without errors. So after calling Dispose, the object effectively no longer has a finalizer. The garbage collector can safely discard it at the next collection cycle, and the object will never get promoted into generation 1.

Okay, so I am going to make some changes to the program to use the new object that implements the dispose pattern. First, I will show you what happens when I simply change the type of object being created and do not do anything else. Let me run the program. Here we go. And here are the results: pretty much the same as the previous run. 462,000 objects being finalized and an average lifetime of 639 milliseconds. I now have this cool Dispose method, but I am not actually calling it. So the resources still get released in the finalizer, and the average resource lifetime is pretty much the same as before.

So let's fix that. I am going to change the way the object gets allocated inside the loop. I will use the using statement, which guarantees that the Dispose method gets called at the end of the using block. So this should do it. Now, in each loop iteration, the object gets created, used and then disposed immediately. I am holding on to the resources for as briefly as possible. I'm going to run the program again. Check this out. And here are the results. The program has now disposed all 500,000 objects, because the loop disposes each object before starting the next iteration. And the average lifetime of the resource has dropped to a staggering 0.26 milliseconds. That's 245,000 times less than the code that did not use the dispose pattern. You can visualize the results like this.
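The pieces described so far (the IDisposable interface, the protected virtual Dispose method with its Boolean argument, the disposed flag, switching off the finalizer and the using statement) fit together as in the following sketch. It is a generic rendering of the standard dispose pattern rather than the instructor's exact class: the resource handling is left as comments, and the loop count of 5 is only there to keep the example short.

```csharp
using System;

class WithDispose : IDisposable
{
    private bool disposed;

    public void DoWork()
    {
        if (disposed)
        {
            throw new ObjectDisposedException(nameof(WithDispose));
        }
        // ... use the scarce resource here ...
    }

    // Called explicitly by user code, or implicitly by a using statement.
    public void Dispose()
    {
        Dispose(true);
        // Cleanup is done, so the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose was never called.
    ~WithDispose()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed)
        {
            return; // make repeated calls harmless
        }

        if (disposing)
        {
            // Called from user code: other managed objects may be accessed here.
        }

        // Release the scarce (unmanaged) resource here, in both cases.
        disposed = true;
    }
}

class Program
{
    static void Main()
    {
        for (int i = 0; i < 5; i++)
        {
            // Dispose is called automatically at the end of the using block.
            using (var obj = new WithDispose())
            {
                obj.DoWork();
            }
        }
        Console.WriteLine("All objects disposed.");
    }
}
```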
If this horizontal graph represents time, we can draw a line all the way on the left, when the object gets constructed, and another line all the way on the right, where the object gets finalized. But this point here is when we are done with the object and about to let it go out of scope. It is vital to explicitly call Dispose at this point, because otherwise we would be holding on to the resources for the entire length of time between dereferencing and finalization, which is 245,000 times longer than the time we actually needed the object in the first place.

Here is a summary of what we've learned in this lecture. The dispose pattern provides a method for explicitly releasing scarce resources. The using statement will automatically call the Dispose method at the end of the using block. The dispose pattern dramatically reduces the length of time that resources are held by objects. Calling the Dispose method suppresses the finalizer, which prevents the object's lifetime extending into generation 1.

12. Tip #4: avoid boxing and unboxing: I am going to teach you how to improve the memory allocations in your code. In this lecture, we're going to take a look at boxing and unboxing. You might be wondering if boxing and unboxing has any negative consequences like, for example, increased pressure on the garbage collector. Well, let's find out. I wrote a program to measure the difference in memory allocation patterns between a code fragment that does not use boxing versus a code fragment that does use boxing. Here is the code. I am going to start with code that does not use boxing. So here is the main program method, with a simple loop that calls a test method one million times. You can see that I pass the current loop index into the test method. Up here is the test method declaration. The test method accepts an integer parameter and does absolutely nothing.
I am going to create a release build of this program in MonoDevelop by using the Build menu. Here we go. Okay, so now I have an executable file in the bin/Release folder of my project. The next step is to manually run this program on the command line and use the awesome log profiler that is built into the Mono framework. Let me switch over to the command line now. To run a Mono program, all you need to do is call the mono executable and, as the first argument, provide the executable file name that you want to run. To enable the profiler, all you need to do is add this configuration parameter here. The profile parameter enables the log profiler, and the report option indicates that I do not want a log file on disk. I simply want to view the performance report directly on the command line. Okay, so here we go. I am going to run the program, and this is the performance report. Let's go through the sections one by one. Up here is version data. I am using version 0.4 of the profiler. The profiling instrumentation introduced an extra overhead of 60 nanoseconds in my code, which is not that bad. You can see that I ran the program on September 4th, 2015. Next is the JIT section. The just-in-time compiler compiled eight methods into about a kilobyte of native machine language. And here things get interesting. The garbage collector did not resize the heap. It moved 21 objects during heap compaction and performed two collection cycles in generation one, which on average took 1146 microseconds each. You might be wondering why there are no generation zero collections. Well, it's because my program doesn't allocate anything on the heap at all. All the objects in memory are from the .NET runtime itself, and they were created before my program even started. That's why they all ended up in generation one. The allocation summary shows a list of all objects allocated while the program was running.
The list is full of housekeeping objects, but you can see two integers down here. One of these two is our loop variable. So in total, 33 objects have been allocated, which occupy 3496 bytes of heap memory. Okay, now I am going to make a change to the program. Check this out. I am going to change the type of the argument of the test method to object. So now, during each loop iteration, the integer needs to be boxed on the heap so that the object-type argument can refer to it. Let's see what kind of impact this code change is going to have on the memory allocation statistics. I'll build the program again to update the executable file on disk, and then I'll switch over to the command line and run the program with log profiling again. Here we go. Here are the results. Check out the differences. The garbage collector now performed five generation zero collections and two generation one collections. The average time per collection is roughly the same, but that's more than twice the number of collections we had before. But the biggest difference is in the allocation summary. I now have one million and two integers allocated, occupying 24 megabytes of memory. The heap now contains one million and 33 objects spanning 24 megabytes of memory. This is very interesting, because you probably know that an integer is only four bytes in length, but a boxed integer on the heap occupies 24 bytes, so we're using six times as much memory as the scenario without boxing. The difference is quite dramatic. I went from four kilobytes of allocated heap memory to 24 megabytes, and from two integers to one million and two, which required five additional generation zero collections to remove everything from the heap. So the first rule of improving memory allocations in your code is: avoid boxing and unboxing wherever you can, because it has the unwelcome tendency to absolutely flood the heap with lots and lots of small objects.
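The boxing variant can be sketched as follows. This is a minimal illustration, not the lecture's actual program: the names `BoxingDemo`, `BoxIsIndependentCopy` and `EachBoxIsNewObject` are made up for this example, and the `NoInlining` attribute is an added assumption to discourage the JIT compiler from optimizing the empty call away. The collection count is printed without an expected value, because it depends on the runtime and its garbage collector settings.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class BoxingDemo
{
    // The parameter type is object, so every int argument must be boxed.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Test(object value)
    {
    }

    // Boxing copies the value into a separate heap object.
    public static bool BoxIsIndependentCopy()
    {
        int number = 42;
        object boxed = number;     // boxing: a new heap object holding 42
        number = 43;               // changing the local does not affect the box
        int unboxed = (int)boxed;  // unboxing: copies the value back out
        return unboxed == 42;
    }

    // Boxing the same value twice produces two distinct heap objects.
    public static bool EachBoxIsNewObject()
    {
        int number = 42;
        object first = number;
        object second = number;
        return !object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < 1000000; i++)
        {
            Test(i); // implicit int -> object conversion boxes i on each call
        }
        int after = GC.CollectionCount(0);
        Console.WriteLine("Generation 0 collections during the loop: " + (after - before));
        Console.WriteLine(BoxIsIndependentCopy());
        Console.WriteLine(EachBoxIsNewObject());
    }
}
```

The only source-level difference from the baseline is the parameter type, which is what makes boxing easy to introduce by accident.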
Okay, here is a summary of what we have learned in this lecture. Boxing and unboxing is a process that lets you use value types and reference types interchangeably in your code. Each boxing operation creates one new object on the heap. A boxed object occupies more memory than the original value type. Boxing can flood the heap with lots of small objects and put considerable pressure on the garbage collector. Boxing and unboxing should be avoided wherever possible.
13. Tip #5: do not add many strings together: The nice thing about this section is that you can read the conclusion right from the lecture title. So apparently you are not supposed to concatenate strings together. But why not? What's so bad about joining a few strings together? Well, let's find out. I have written a small test program to investigate the effects that string concatenation will have on memory allocation and garbage collector behavior. Take a look at this. I have the main program method here, and all it does is call one of two test methods. The first method is called StringTest, and it is declared up here. You can see that all I'm doing is building a 10,000-character string consisting entirely of hash characters. I start with an empty string and then, in a loop, I add a single character 10,000 times. Now the first thing I want to do is set up a baseline measurement. So I am going to comment out the call to StringTest in the main method here. So now the program does absolutely nothing, and we can observe how the garbage collector behaves in this situation. We can also determine how many objects are present on the heap by default. I'm going to build the project using the Build menu and then switch over to the command line so I can run the program with the log profiler that's built into the Mono framework.
As you've seen in the previous lecture, all you need to do is call the executable with the mono command and then provide this profile argument to enable the log profiler and set it to display the performance report directly on the console. Okay, here we go. I'm running the baseline program now, and here are the results. The garbage collector performed two generation one collections. There are 19 strings on the heap, which occupy slightly over a kilobyte of memory. Total heap memory is 3496 bytes, allocated by 33 objects. So now I'll switch back to MonoDevelop, uncomment the call to StringTest, rebuild the program, switch back to the command line and run the program again. Let's see what kind of impact the string concatenations are going to have. Check this out. Look at this. There are now three generation zero collections and seven generation one collections. In the allocation summary, we can see that there are 10,020 strings on the heap, occupying about 100 megabytes of memory. There's also a method call summary section down here. We can see that we made 10,000 calls to the string.Concat method, which is exactly as expected. But the Concat method called another method called InternalAllocateStr. This method was called 9,999 times, and this resulted in almost 15,000 memory copy operations. What is going on here? The reason for this behavior is that strings are immutable objects in .NET. Simply put, this means that string data can never be modified in any way. Now, this might sound a little weird, because if that were so, how can methods like Replace and Concat ever work? Well, the answer is that those methods do not directly modify the string. Instead, they create an entirely new string on the heap with all the modifications. So this line will leave the text variable completely unchanged and instead create a copy of text, perform the replacements in this copy, and then return the copy and store it in the result variable. This is a golden rule
of string operations in .NET: string methods never modify the original string. Instead, they always make a copy, modify the copy and return that copy as a result. The reason for this behavior is that it makes strings behave like value types, meaning they can be assigned and compared by value, which is very convenient for developers. You never have to worry if two string variables might be referring to the same string on the heap. Also, it allows the .NET runtime to perform some cool optimizations on string handling code to improve performance. So in my test program, here's what happens. I start out with an empty string. When I add the first character, the Concat method doesn't modify the string variable. Instead, it makes a copy of the string, adds the character to the copy and assigns the copy back to the string variable. In the next loop iteration, the same thing happens again. The one-character string is copied, a second character is added to the copy, and the copy is assigned back to the string variable. After 10,000 concatenations, I will have 9,999 dereferenced strings on the heap, plus one active string with the final result. This is hugely inefficient. So it turns out that strings are actually optimized for fast comparison, and they are not very good at bulk modifications. To perform lots of modifications to a single string, there's a much better class specifically designed for that purpose: the StringBuilder. A StringBuilder behaves more like what you would expect a string to be: a character buffer in memory that you can freely write to. When you add a character to a StringBuilder, there's no copying going on behind the scenes. Instead, the .NET framework simply writes that character directly into the string. So let's modify the program to use a StringBuilder instead and see how that affects the memory allocation pattern. Here is my program again. There is another test method called StringBuilderTest.
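The copy-on-modify rule and the two test methods can be sketched like this. It is an illustration under assumptions: the method names follow the lecture's spoken names (`StringTest`, `StringBuilderTest`), while the class name `StringDemo`, the `count` parameter and the sample strings in `Main` are invented for this example.

```csharp
using System;
using System.Text;

public static class StringDemo
{
    // Builds the string by repeated concatenation. Each += creates a new
    // string on the heap and leaves the previous one behind as garbage.
    public static string StringTest(int count)
    {
        string text = "";
        for (int i = 0; i < count; i++)
        {
            text += "#";
        }
        return text;
    }

    // Builds the same string by appending to a StringBuilder's buffer.
    public static string StringBuilderTest(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append('#');
        }
        return builder.ToString();
    }

    public static void Main()
    {
        string text = "one two";
        string result = text.Replace("two", "three");
        Console.WriteLine(text);   // prints "one two": the original is unchanged
        Console.WriteLine(result); // prints "one three": Replace returned a new string

        // Both approaches produce equal strings; only the allocations differ.
        Console.WriteLine(StringTest(10000) == StringBuilderTest(10000)); // prints "True"
    }
}
```

The `StringBuilder` class also has a constructor that takes an initial capacity in characters, for example `new StringBuilder(10000)`.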
You can see it's basically the same code, but it uses a StringBuilder instead to append the 10,000 characters. I will change the main program method to call the StringBuilderTest method instead. Okay, so now again the familiar routine of building the program, switching over to the command line and running the program with log profiling enabled. Here we go. And here are the results. The garbage collector is back to only two generation one collections, which is identical to our baseline measurement. So the StringBuilder code places no extra pressure on the garbage collector whatsoever. Fantastic. And here is the allocation summary. I now have 31 strings occupying slightly over 64 kilobytes. This is a massive improvement. I went from 100 megabytes to 64 kilobytes, which is a 99.9% improvement in memory footprint. But hold on. My string eventually contains 10,000 characters, right? And a Unicode character in .NET occupies two bytes. So why do I have 64 kilobytes allocated by 31 strings? Shouldn't my string just be 20,000 bytes? The answer is that the StringBuilder initially only has a tiny amount of character memory available. When the buffer is full, the StringBuilder allocates a new buffer of double the size and copies all data into this new buffer. So for 10,000 characters, we progressively go through buffer sizes of 1, 2, 4, 8 and 16 kilobytes until we eventually reach 32 kilobytes, which is enough to hold 10,000 Unicode characters. The sum of 1, 2, 4, 8, 16 and 32 is 63 kilobytes, exactly what we're seeing on the heap right now. To make sure I allocate only the exact required amount of memory, I need to initialize the StringBuilder to the correct size. If the character buffer starts out at 10,000 characters right away, there's no need for it to double in size while the loop is running. This will speed up my program and reduce the memory footprint. To implement the change,
all I need to do is add a constructor argument here, specifying the initial capacity of the StringBuilder in characters. I will initialize it to exactly 10,000 characters, so everything fits. Okay, rebuild the program, switch over to the command line and run the program with log profiling. Here we go. And here are the results: 22 strings occupying slightly over 20 kilobytes, just as we would expect. I just removed another 70% of unnecessary memory allocations. Here is a summary of what we have learned in this lecture. String methods never modify the original string. Instead, they make a copy, modify the copy and return that copy. Strings are optimized for fast comparisons. The StringBuilder class is optimized for fast modifications. Always use StringBuilders when modifying strings in mission-critical code. In my test program, switching from strings to StringBuilders reduced the memory footprint by 99.9%.
14. Tip #6: use structs instead of classes: In this lecture, I am going to take a closer look at structs. What exactly are structs, and how do they differ from classes? Many people get confused when they have to explain the difference between structs and classes. Let's go through the differences one by one. We covered the difference between value types and reference types in the fundamentals section. To recap, value types are stored inline, either on the stack or on the heap, whereas reference types always refer to an object that is stored elsewhere on the heap. We'll get back to this difference later in this lecture, where you will see that it has a dramatic effect on the memory footprint and garbage collector behavior. Because structs are value types, they are assigned and compared by value. When you assign structs, the .NET runtime creates an entirely new struct and copies all fields over. Compare this to a class, where the reference is copied over and you end up with two references to the same object on the heap.
This can never happen with structs. Because structs are value types, they cannot inherit from a base class. You are free to implement interfaces, but you cannot inherit any kind of implementation. Another somewhat weird restriction is that internal fields cannot have initializers. All fields are initialized to their default value automatically, and you can only override their values in the constructor. And speaking of constructors, a struct cannot have a parameterless constructor. You must declare at least one argument. The reason for this seemingly strange restriction is that the runtime initializes structs by zeroing their memory. In contrast, classes are initialized by calling their default constructor. This makes structs initialize much faster than classes. And finally, structs cannot have finalizers. This is pretty obvious when you consider that structs are value types, and the garbage collector, which is responsible for running finalizers, will always ignore value types. It can safely do this because value types do not need to be explicitly cleaned up. Think about it. When a value type resides on the stack, it will be cleaned up when the code returns from the method and the corresponding stack frame is discarded. And when a value type resides on the heap, it will be cleaned up when its containing type is collected by the garbage collector. This is an important thing to remember: the garbage collector only marks and sweeps reference types. It will always ignore value types, regardless of how they are used. Okay, so that was a brief introduction to the difference between structs and classes. Now let's see how they behave in an actual test program. Take a look at the following code. I've written a short test program that allocates one million point objects and stores them in a generic list. My point objects are simple containers for an X and a Y value.
You often see these kinds of classes in programs that need to track pixels, map coordinates or mathematical vectors. So my main program method down here calls a single test method called TestClass. The TestClass method up here allocates one million PointClass instances in a loop, initializes them using the loop counter and adds them to this generic list. This is the class declaration for the point class. You can see that it is very uncomplicated: just two public fields for the X and Y coordinates and a constructor to initialize the point. Okay, before I run this program, I need a baseline measurement to see what kind of objects are present on the heap by default. So I am going to comment out the call to the TestClass method in the main program method. Now the program does absolutely nothing, and we can use this to create a baseline for our profile report. I am creating a release build now, then I switch over to the command line so that I can run the program with the log profiler. Here we go. And here are the results. We have two generation one collections by the garbage collector. There are 21 object moves due to heap compaction cycles, and the heap contains 3552 bytes allocated by 33 objects. Okay, back to the code. I will remove the comments so we can run a measurement for the implementation that uses classes. Let's see what the heap looks like after allocating one million PointClass instances. Okay, create a release build, switch over to the command line and run the program. Here we go. And here is our answer. This code puts quite a load on the garbage collector. We have four generation zero collections and three generation one collections, more than three times as many collections as the baseline measurement. The garbage collector performed 870,224 object moves while compacting the heap.
That's a lot of memory copy operations, which will negatively affect the performance of this program. And look at the allocation summary here. My array is here, one single instance occupying eight megabytes of heap space. This is what I would expect, because an object reference on a 64-bit system is eight bytes in size, and one million objects times eight bytes is eight megabytes. And here are the point instances. I have one million points occupying 24 megabytes of heap space. So in total, my program allocated 32 megabytes of heap memory. Finally, in the method call summary we can view the total run time of the program. It is 2.37 seconds. Now you might be wondering how one million points add up to 24 megabytes. My point class only contains two integers of four bytes each, so you'd expect one million points to occupy only eight megabytes. What happened here? The answer is that each object introduces 16 bytes of overhead, which is used for housekeeping data, so each point is actually 24 bytes in size, three times larger than expected. Let's see if I can make this program more efficient by using structs instead of classes. I am going to change the main program method to call the TestStruct method instead. You can see from the declaration that the method is pretty much the same as TestClass. It allocates one million points and stores them in a list. The only difference is that I am now using PointStruct instead of PointClass. So what's a PointStruct? Check out the declaration here. It's simply a struct with public X and Y fields and a constructor for initializing the fields. Watch when I quickly toggle between the class and the struct. See, they are pretty much identical. Okay, back to the main program method. I will change the test method call to use structs instead. And now the usual ritual: create a build, switch over to the command line and run the program. Here are the results.
We are back to two generation one collections and 21 object moves, which is identical to the baseline. So now there is no pressure on the garbage collector whatsoever. The allocation summary shows the array occupying eight megabytes of data and nothing else. So now the total heap memory is only eight megabytes, a staggering fourfold improvement in memory footprint. How did that happen? To understand the difference, we have to visualize how the data is stored in memory. Let's start with the point class code. A list of one million PointClass instances looks like this on the heap. Each item in the list is an eight-byte reference to a PointClass instance elsewhere on the heap, and each PointClass instance consists of 16 bytes of housekeeping data and eight bytes of actual field data, so 24 bytes in total. This all adds up to eight megabytes for the list and 24 megabytes for the point instances, or 32 megabytes in total. Now compare this to the struct implementation. As you know, structs are stored inline, so this means they are stored directly in the list items themselves. There are no references in this scenario. Each struct has zero overhead and occupies only eight bytes of heap memory for the two integers. So the entire array is still eight million bytes, but without any additional objects on the heap. This brings the memory footprint down to eight megabytes, a fourfold improvement. Because there is now only a single array on the heap, the garbage collector doesn't have to do any extra work. It simply cleans up the array. Because the points are stored inside the array, this also cleans up all one million points for free. Finally, let's look at the program run time. Total run time is now only 322 milliseconds. The reason for this is that this implementation does not have to call one million point constructors, store one million heap references in the array and move 870,000 objects to compact the heap. That saves a lot of time.
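A sketch of the two point types and test methods described above is shown here. The type and method names follow the lecture's spoken names (`PointClass`, `PointStruct`, `TestClass`, `TestStruct`); the `StructDemo` class, the `count` parameter and the pre-sized lists are assumptions added for illustration, and the copy-semantics check in `Main` is not part of the lecture's program.

```csharp
using System;
using System.Collections.Generic;

// Reference type: each instance is a separate heap object.
public class PointClass
{
    public int X;
    public int Y;
    public PointClass(int x, int y) { X = x; Y = y; }
}

// Value type: the two fields are stored inline wherever the struct lives.
public struct PointStruct
{
    public int X;
    public int Y;
    public PointStruct(int x, int y) { X = x; Y = y; }
}

public static class StructDemo
{
    // The list holds references; every point is an extra heap allocation.
    public static List<PointClass> TestClass(int count)
    {
        var list = new List<PointClass>(count); // pre-sized (assumption)
        for (int i = 0; i < count; i++)
        {
            list.Add(new PointClass(i, i));
        }
        return list;
    }

    // The list's backing array holds the point data itself.
    public static List<PointStruct> TestStruct(int count)
    {
        var list = new List<PointStruct>(count); // pre-sized (assumption)
        for (int i = 0; i < count; i++)
        {
            list.Add(new PointStruct(i, i));
        }
        return list;
    }

    public static void Main()
    {
        var c1 = new PointClass(1, 2);
        var c2 = c1;             // copies the reference: both refer to one object
        c2.X = 99;
        Console.WriteLine(c1.X); // prints 99

        var s1 = new PointStruct(1, 2);
        var s2 = s1;             // copies the fields: s2 is an independent value
        s2.X = 99;
        Console.WriteLine(s1.X); // prints 1

        Console.WriteLine(TestClass(1000000).Count);
        Console.WriteLine(TestStruct(1000000).Count);
    }
}
```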
Okay, so structs are great, and we should always use them, right? Well, no. Structs are great, but you should only use them in very specific scenarios. Here's a list. Use structs when the data you are storing represents a single value. Examples are a point, a vector, a matrix, a complex number, a value pair, a numeric tuple, etcetera. Use structs if your data size is very small, 24 bytes or less, and you are going to create thousands or millions of instances. Use structs if your data is immutable and should be assigned and compared by value. Use structs if your data will not have to be boxed frequently. In all other scenarios, classes are a better alternative. So to summarize: structs are value types, classes are reference types. Structs are much faster than classes because they do not have a default constructor, they cannot be inherited, and they are not collected by the garbage collector. Structs only allocate heap memory for their internal fields, and they do not have the 16-byte overhead that objects on the heap have. In certain scenarios, using structs instead of classes can dramatically reduce the memory footprint and run time of your code. My test program with structs allocated four times less memory and ran seven times faster than the implementation that used classes.
15. Tip #7: always pre-allocate collections: In this lecture, I'd like to focus on an interesting aspect of collections that's often overlooked. But first, let's take a look back at the earlier string concatenation lecture. You'll remember that we looked at string concatenation, and we discovered that strings are immutable. Appending lots of characters to a string resulted in a terrible memory footprint, because the entire string got copied on the heap during each concatenation. We fixed the problem by using a StringBuilder instead.
A StringBuilder is a character buffer in memory, with a certain capacity, that you can directly write into. Replacing strings with StringBuilders made a huge improvement to the memory footprint of my code. But do you remember what happened when I ran the StringBuilder code for the first time? My 10,000-character string ended up occupying 64 kilobytes of heap space. The reason for this is that the StringBuilder initially only has a tiny amount of character memory available, and when the buffer runs out of space, it simply allocates a new buffer of twice the size and copies everything into the new buffer. Consecutive resizes of 1, 2, 4, 8, 16 and 32 kilobytes resulted in 63 kilobytes of allocated heap space, much more than I actually needed for the string. Now here's an important rule. Every collection and list in .NET does the same thing. A list starts out with a certain default capacity, and when it runs out of space, it will create a new list of twice the size and copy everything over. This will inflate the memory footprint of our code and negatively affect performance, because of all the memory copy operations happening behind the scenes. Let's do a bunch of measurements to find out how much memory overhead list resizing will introduce. Take a look at this code. I declare a bunch of collections up here: an ArrayList, a Queue, a Stack, a generic List and a Dictionary. My main program method calls a single method called InitCollections, which adds a single item to each collection. This is necessary because some collections implement lazy loading. They initialize their internal storage only when you add the first item. So to force each collection to initialize, I need to add at least one item to each of them. Now the next step is a bit funky. To find out what the default capacity is, I need to look inside the collection classes at their internal private implementation.
There's no way I can accomplish that in code except by using reflection. But a much easier way is to run my program in debug mode, set a breakpoint and then look inside the collections with the debugger. So I am going to put a breakpoint here and run the program. Now I can use the watch window to inspect each collection. It all starts with the ArrayList. The internal array will be a private member, so I need to expand the Non-Public Members folder here. And here it is: an object array called items with a length of four. Next is the Queue. The internal storage array is again a private member, called array, with a length of four. Next is the Stack. The internal storage array is an integer array called array, with a length of 16. Next, the generic List. The internal storage array is an integer array called items, with a length of four. And finally, the Dictionary. The dictionary has many internal arrays, but I'll focus on this one here, called keySlots, which has a length of 12. So to sum up: ArrayList, four items; Queue, four items; Stack, 16 items; List, four items; Dictionary, 12 items. You can see that these are tiny initial capacities. If you start adding hundreds or thousands of items to a collection, it will have to resize many times to accommodate all items. Let's see exactly how big the memory footprint can get. I'm going to modify my program to fill the generic list with a few hundred thousand items. So down here in the main program method, I am going to comment out the call to InitCollections and instead put in a call to the FillList method. Check out the FillList method. Here I use a loop to add exactly 262,145 integers to the list. So how much memory is required for the list? Each item is a four-byte integer, which is a value type that is stored inline in each array item. So the total memory footprint is going to be four times 262,145, which is 1,048,580 bytes, or pretty much one megabyte of storage.
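For the generic list specifically, the growth pattern can also be observed through the public `Capacity` property of `List<T>`, without a debugger. The sketch below is an illustration, not the lecture's program: `CapacityDemo`, `TrackCapacities` and `MaxItems` are hypothetical names, and the doubling sequence starting at four matches the behavior described in the lecture.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds 'count' integers to the list and records every capacity value
    // the list passes through along the way.
    public static List<int> TrackCapacities(List<int> list, int count)
    {
        var capacities = new List<int>();
        int last = -1;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != last)
            {
                last = list.Capacity; // the backing array was (re)allocated
                capacities.Add(last);
            }
        }
        return capacities;
    }

    public static void Main()
    {
        const int MaxItems = 262145;

        // Default construction: the backing array doubles each time it fills up.
        List<int> grown = TrackCapacities(new List<int>(), MaxItems);
        Console.WriteLine(string.Join(" ", grown));

        // Pre-sized construction: a single allocation of exactly MaxItems slots.
        List<int> presized = TrackCapacities(new List<int>(MaxItems), MaxItems);
        Console.WriteLine(string.Join(" ", presized));

        // Bytes needed for the items themselves: 4 * 262,145.
        Console.WriteLine(4 * MaxItems); // prints 1048580
    }
}
```

The final capacity in the first run is 524,288 items, which is two megabytes of four-byte integers for a little over one megabyte of data.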
Okay, let's test that theory. I am going to create a release build and run the log profiler to find out exactly what the memory footprint is going to be. Here we go. I'm running the program now. And here are the results. The integer array occupies 4,195,168 bytes, or slightly over four megabytes. What happened here is that the generic list kept resizing its internal array over and over until eventually all 262,145 items fit in the array. But I did not pick that number at random. 262,145 times one four-byte integer is exactly one megabyte plus four bytes, so the buffer needs to expand to two megabytes to fit all items, and because it has already cycled through all preceding powers of two, the total allocated heap memory is four megabytes. This represents the absolute worst-case scenario. All items except the final one fit in the array, so the list has to double in size to accommodate the final item. You are left with a memory footprint that is four times larger than what is actually required. The solution for this problem is super easy. All you have to do is initialize the list to the correct size. Let me do that right now in code. So all I have to do is change the list declaration up here and add the max items value to the constructor of the list. Now I need to create a new release build, then change over to the command line and run the program again. And here are the results. The memory footprint is back to one megabyte, a fourfold improvement. Let me summarize what we've seen in this lecture. All collections in .NET are initialized to a tiny default capacity and automatically double their size when they're full. This behavior can result in a worst-case memory footprint that is four times larger than necessary. To avoid this problem, you must initialize the collection to the correct size. 16.
Tip #8: do not materialize LINQ expressions early: In this lecture, I want to focus on an unexpected pitfall when writing LINQ queries that can greatly inflate the memory footprint of your code. LINQ was introduced in C# version 3, and it's a very nice query language, somewhat similar to SQL, for performing complex queries on enumerable data sources. LINQ has many built-in functions for filtering, projecting, aggregating and joining data series. Here's how it works behind the scenes. LINQ can operate on any data that implements the IEnumerable interface. Enumerable data is basically a collection of items that you can step through one by one. The IEnumerable interface contains only a single method, called GetEnumerator, for instantiating an enumerator to step through the data. The enumerator object also has its own interface, called IEnumerator. It contains only three members: MoveNext to move to the next item in the series, Current to retrieve the current item, and Reset to jump back to the beginning of the series. The enumerator implements a kind of forward-only cursor for retrieving the items in the series one by one. When you use the foreach statement, the compiler creates an enumerator behind the scenes to step through all items in the series one by one. The nice thing about LINQ is that it can stack operations on top of an enumerator without actually running them. Consider the following code. This expression prepares a range of 500 odd numbers, but it doesn't actually generate the numbers yet. All it does is create an enumerator and add a filter expression to it. When you step through the range, for example by using the foreach statement, the enumerator uses the MoveNext method and applies the filter expression to return the correct values. Another interesting thing to remember is that the range does not occupy any memory. The enumerator only tracks the current number, and
it has a filter expression for calculating the next number in the range, so the total memory footprint of the enumerator is only the size of a single item, even though the range describes 500 items. You can use LINQ to process very large amounts of data while only using a tiny amount of heap memory, because the enumerator will only ever track the current item. But LINQ can behave in unexpected ways. Consider the following program. I wrote a spell checker in C#. You can see that this project is called BadSpellChecker, because it liberally allocates memory on the heap and produces a huge memory footprint. Let's take a look. I'll start with this ReadFile method here. It opens a text file, reads it line by line and uses the yield return statement to return each line as a new item in an enumerable sequence of strings. In effect, the entire method gets turned into an enumerator. The yield return statement returns the current value, and the MoveNext method will advance the while loop to the next line in the file. Down here is the main program method. I read the all words file, which contains a dictionary of approximately 150,000 correctly spelled words. Then I load another file called story, which is the first line of the Wikipedia article on the country of Spain. Then I call the SpellCheck method, which is declared up here. SpellCheck uses a LINQ query to step through each word in the story. For each word, I generate the lowercase version of the word and then project a new tuple consisting of the original word and a Boolean value indicating if the word could be found in the dictionary. Again, this line of code doesn't generate the sequence. It simply builds a very complicated enumerator to run a spell check on each word in the story. The sequence is finally being generated here in the foreach statement. I step through each tuple in the results,
For each tuple, I set the console color to either green or red, depending on whether the word was found in the dictionary or not, and then display the word on the console, plus a trailing space. So this program should display the entire story word by word and highlight any spelling mistakes in red. Let's run the program to check if everything works. Just like in previous lectures, I will generate a release build, and then I'll switch over to the console and run the program on the command line with the log profiler, so that we can look at all the objects getting allocated on the heap. Here we go. You can see that the performance of this code is not great. This is like one word per second or so. At this rate, it would take forever to check the entire Wikipedia article, so I guess I'm lucky that I'm only checking the first sentence. Okay, so let's wait until the program completes. Here are the results, and they are not good. The garbage collector performed 2.6 million object moves while compacting the heap. There were 24 generation zero collections and 12 generation one collections. In the allocation summary, we can see that I have 2.7 million strings on the heap, occupying 128 megabytes. I also have 349 string arrays occupying another 78 megabytes. Total allocated heap memory is over 200 megabytes, and the method call summary shows that the total program run time is 38.5 seconds. What happened here? Let me show you what went wrong. I am going to switch back to MonoDevelop. Now look at this ToList method here. I need this method because I have to check if the word in the story appears in the dictionary. I do that with the Contains method, but to call that method I first need to convert the dictionary word enumeration into a list that I can search. Now here's the problem. The dictionary variable itself is an enumerator that loads the word list on demand.
So for each word in the story, this code is going to reload the entire word list and convert it to a list, just to check that single word. I have 18 words in my story, so I end up having the entire dictionary in memory 18 times. There are 145,000 words in the dictionary, so that leaves me with roughly 2.7 million strings occupying 128 megabytes. The dictionary list uses an internal string array for storage. The dictionary requires slightly over one megabyte of storage, so the list will have grown to two megabytes of capacity, and all the previously discarded string arrays add up to another two megabytes. So each dictionary occupies four megabytes of heap memory, and since I have 18 of them, this adds up to 72 megabytes of heap memory. Together, this adds up to a stunning 200 megabytes of heap memory. That is a massive waste of memory, and this is the pitfall I was talking about. It is not immediately obvious from the dictionary variable that it is actually an enumerator that loads the entire dictionary on demand. The spell checker code looks reasonable, but only if the dictionary is actually cached entirely in memory, so we can quickly query it, and this is not the case here. When you are writing complex LINQ queries, it is very easy to lose track of the implementation of the enumerators you use. How complex is the implementation of the enumerator? How many items does it expose? You will need this information before you can decide how to query the sequence effectively. LINQ implements a very abstract query layer on top of C# code, and sometimes the implementation of the underlying enumerable data can come back to bite you in unexpected ways. The good news is that it is not hard to fix the spell checker. I did this in another project called Good Spell Checker. Let's take a look.
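Neither project's source code is part of this transcript. The following sketch contrasts the two shapes of the query under simplified assumptions: a five-word dictionary held in a string instead of a file, a four-word story, and a counter that records how many dictionary lines get read. All names (ReadDictionary, dictionaryLinesRead, badQuery, goodQuery) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

const string DictionaryText = "a\ncountry\nin\nis\nspain";
string[] story = { "Spain", "is", "a", "cuntry" };

// Counts every dictionary line that gets read (illustration only).
int dictionaryLinesRead = 0;

// Lazy source: each enumeration re-reads the whole word list from the start.
IEnumerable<string> ReadDictionary()
{
    using var reader = new StringReader(DictionaryText);
    for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
    {
        dictionaryLinesRead++;
        yield return line;
    }
}

// Bad shape: ToList() sits inside the query, so it runs once per story word
// and builds a fresh copy of the dictionary every time.
IEnumerable<string> lazyDictionary = ReadDictionary();
var badQuery = story.Select(w =>
    (Word: w, Known: lazyDictionary.ToList().Contains(w.ToLowerInvariant())));
var badFlags = new List<bool>();
foreach (var (_, known) in badQuery) badFlags.Add(known);
int badReads = dictionaryLinesRead;

// Good shape: load the dictionary once into a pre-sized list, then query it.
dictionaryLinesRead = 0;
var dictionary = new List<string>(capacity: 5);
foreach (var entry in ReadDictionary()) dictionary.Add(entry);
var goodQuery = story.Select(w =>
    (Word: w, Known: dictionary.Contains(w.ToLowerInvariant())));
var goodFlags = new List<bool>();
foreach (var (_, known) in goodQuery) goodFlags.Add(known);
int goodReads = dictionaryLinesRead;

Console.WriteLine($"dictionary lines read, ToList inside the query: {badReads}");
Console.WriteLine($"dictionary lines read, list loaded up front:    {goodReads}");
```

Both versions flag the same words. The only difference is how often the word list is read and copied, which grows with the length of the story in the first version and stays constant in the second.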
The good spell checker is very similar to the bad one, but a notable difference is that I declare the dictionary to be a generic list of strings up here. This change removes the need for the ToList method call in the spell checker code. Note also that I initialize the list to 250,000 items. This prevents the list from doubling in size while adding items, and removes all those dereferenced string arrays on the heap. The spell check method is almost exactly the same as in the bad code. The only difference is that I can now call the Contains method directly on the dictionary variable, because it is already a generic list. The final change is in the main program method down here. I now initialize the dictionary by using this foreach statement. Okay, let's see how this version measures up. I am creating a release build, switching over to the command line, and running the program using the log profiler. Here we go. And here are the results. We are down to 145,000 object moves, one generation zero collection, and two generation one collections. Much better. The heap now contains 145,000 strings. This is our dictionary, occupying 6.8 megabytes of heap memory. I also have nine string arrays occupying 1.2 megabytes. This is exactly what you would expect. A string array with 145,000 elements, each an eight-byte heap memory reference, will occupy slightly over one megabyte of heap memory, which is exactly what we're seeing here. So this implementation will only allocate the word dictionary on the heap and nothing else. This brings down the memory footprint to eight megabytes, a massive 25-fold improvement. So what have we learned? LINQ is a powerful framework for running queries on enumerable data. Calling the ToList method in a LINQ expression can unexpectedly inflate the memory footprint of your code. To fix this problem, call ToList before running the LINQ query. In my spell checker code, pre-initializing the word dictionary resulted in a 25 times smaller memory footprint. 17. 
Course recap: Congratulations. You have completed the entire course. You are now a certified C# memory optimizer. I have shown you how .NET garbage collection works and how to optimize your own code for garbage collection, and I demonstrated several simple tricks to dramatically improve the memory footprint of your code. We took a detailed look at garbage collection, identified the assumptions the garbage collector makes about object size and lifetime, and researched how you can optimize your own code to work with these assumptions. We also covered memory optimization. We measured the effects of several simple tricks where a tiny code change resulted in a huge improvement in the memory footprint of the code. The skills you learned have given you a rich toolbox of knowledge and ideas that you can use when writing your own code or when collaborating in a development team, especially when you're working on mission-critical code where low memory usage and fast performance are crucial. If you discover some interesting insights of your own, please share them in the course discussion forum for us all to enjoy. Goodbye. I hope we meet again in another course.