Wednesday, October 05, 2016

CODE GENIUS - OMG Ruby and Rails Performance by Aaron Patterson

I think I only have 30 minutes tonight, but I've got a hundred and sixty slides.
It's cool, because we can do this anyway. If there's one thing that you can
learn tonight, the one thing that you need to learn, it's this: on a Mac, if
you go to Preferences, Keyboard, Text, it will bring up this menu here, and you
can enter little shortcuts for whatever you want. So on my machine, when I type
"face" it turns into that, or I can type "heart" so I can get hearts out too. So
I'll just show you a little demo: face, hearts. This is what I do all day. This
is what I'm paid to do.

So I really want to start out with an apology. I need to apologize to all of
you. I know that you came here tonight to see a Code Genius, and instead you got
me. I think the word "luminary" was used; I'm just not very bright. So what
I'm really trying to tell you is to lower your expectations. I want to say
thank you, thank you for having me. Thank you to Code Genius, please give them a
round of applause. I'm really happy to be here, it's great to be in New York.

This talk is titled "Oh My God, Ruby and Rails Performance". So, I have to give
you a warning: this is not a soft talk at all. This is a technical talk. We're
going to be talking about technical stuff, and I'm going to try to do it in the
28 minutes that I have left. This is not going to be a talk about how to find
yourself, because I'm right here. I have found myself; hopefully I'm not lost.
I'm gonna be talking about a lot of weird technical stuff, and the reason I'm
going to do it is because I love this stuff, I love it so much. So this talk
may be fun for me and not for you. Just a warning. OK, anyway, thank you.
So, like I said, I just think about weird things all the time, and most of this
talk is just going to be about weird stuff I think about. Like, well, for
instance, my cat. This is one of my cats. Her name is 22; her real name is
SeaTac, named after the airport. She's not very famous at all, but I love her.
And then this is my other cat, Gorbachev. He's the more famous one, he has a lot
of Twitter followers. Also, my
wife and I went to gets get some photos taken holiday photos I thought I just I
mean it i thought i'd share them with you this is this is about your wife I
thought that kidney
yes we did do this at jcpenney
it was amazing especially explaining what we want
as far as don't just take themselves right then
Alright, so we're going to be talking about Rails performance and Rails 5.
Actually, I'm going to talk about some bugs, I'm going to talk about tests, and
I'm gonna talk about performance. So it's not really going to be about Rails 5
so much as just bugs and performance and weird stuff, just stuff from all
different places. I just want to give all of you ideas to think about, or
things you want to ask me, just to get your minds going.
The first thing I want to talk about is a testing idea that I have. You might
have seen it if you read my blog, the page that was up here a little bit
earlier, which by the way, my website is basically the best troll ever: I'm
getting people to say "tenderlove making" in a professional setting. It's so
hot.

Anyway, so at work we have slow tests. I don't know, you may have slow tests at
work, or you may not have tests at all, and in that case they're really fast,
infinitely fast. But we have slow tests, and I sit there and I'll write some
code and I'm like, OK, I wrote this code and now I gotta run all these tests,
and it takes like 20 or 30 minutes to run them, and I'm just super annoyed.
Part of the reason I'm annoyed is because 99% of these tests aren't even
running the code I changed. They're not even touching it, and it annoys me to no
end because I'm wasting my time. Why am I running all these other tests? So what
I want to know is: if file A, line B is modified, which tests should I run?
What tests should I be running when I modify this code?

This is called regression test selection, and thankfully I got a patch into
Ruby to do this. We can now do this using trunk Ruby, and I'm going to give you
an overview of how to do it. This is not a complete solution, but if you get
trunk Ruby, this is built in. The way that you do it, in the general case, is
you say, OK, we're going to start our coverage. You may not know this, but Ruby
ships with a code coverage tool built in. You start the coverage, and then each
time you want to get a result of the coverage you call peek_result, and what
that does is give you a snapshot of the coverage right now: at this instant,
what is our code coverage? Then you can run the test and get the results later,
and you can say, well, I knew the coverage before I ran the test and I know the
coverage after I ran the test, so I should be able to calculate what lines of
code this test actually executed. So we can calculate the difference between
those two. I wrote a nice little, please don't read it, I know it's tiny and
you don't need to read it, this is just how to do it with a minitest extension.
What I want to do eventually is to be able to say something like git diff,
pipe, what to run, right? That's what I want: I sit there, hack away, do my
thing, and then it's, OK, git diff, pipe, what to run, and then just run those
tests. So I put together a nice little prototype.
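
The before-and-after snapshot idea can be sketched with Ruby's built-in Coverage module. This is only a minimal illustration, not the minitest extension from the slide: the RtsDemo module, the temp file, and the executed_lines helper are all made up for the example, and calling RtsDemo.double stands in for running one test.

```ruby
require "coverage"
require "tempfile"

# Hypothetical code under test, written to a temp file so that it is loaded
# after Coverage.start (Coverage only tracks files loaded once it is running).
source = Tempfile.new(["rts_demo", ".rb"])
source.write(<<~RUBY)
  module RtsDemo
    def self.double(x)
      x * 2
    end

    def self.triple(x)
      x * 3
    end
  end
RUBY
source.flush

# Lines (1-based) whose hit count went up between two coverage snapshots.
def executed_lines(before, after)
  after.each_with_object({}) do |(file, counts), changed|
    old = before.fetch(file, [])
    lines = counts.each_index.select { |i| counts[i] && counts[i] > (old[i] || 0) }
    changed[file] = lines.map { |i| i + 1 } unless lines.empty?
  end
end

Coverage.start
load source.path

before = Coverage.peek_result   # snapshot before the "test"
RtsDemo.double(2)               # stands in for running one test
after = Coverage.peek_result    # snapshot after the "test"

executed = executed_lines(before, after)
executed.each { |file, lines| puts "#{File.basename(file)}: lines #{lines.inspect}" }
```

Only the body of the method that was actually called shows up in the difference; the untouched method does not.
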
I'm going to show you; hopefully you can see this. Basically, I've got two
files in there, and I require one of the files. This actually builds our
coverage information first. You'll see when I do git status again for a second
time there will be a JSON file, and that contains all the coverage for all the
test runs. So then I can modify the file and just make it blow up. Go, Aaron,
come on, raise an exception. Learn how to use your editor, please, you only
have 20, 30 minutes. Anyway, so raise an exception, and then if I run this
program, what to run, it will predict which tests will fail. You'll see the
output here: those two tests, test record skip and test record failing. It's
predicting which tests will fail without actually running the tests, and I'm
just going to run the full test suite again to show you that my predictions
were actually correct.

So those are the only two tests that failed, and I was able to predict that
those two tests would fail based on the coverage data that I had before. What I
would like to do is have more of an automated solution, where it's like, OK, I
changed these files, I only manipulated this bit of code, only run those tests
for me. That way I'm only running, you know, two tests rather than the 4,000
that have no impact on my code at all. So please, someone steal this idea from
me, because I really like it and I don't want to maintain it.
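
The selection step itself can be sketched as a lookup against the recorded coverage. Everything here is an assumption for illustration: the shape of the JSON file, the file paths, and the line numbers are invented, and only the two test names echo the demo.

```ruby
require "json"
require "set"

# Hypothetical coverage map: test name => { file => lines that test executed }.
coverage_map = JSON.parse(<<~JSON)
  {
    "test_record_skip":    { "lib/record.rb": [10, 11, 12] },
    "test_record_failing": { "lib/record.rb": [11, 20] },
    "test_unrelated":      { "lib/other.rb":  [1, 2, 3] }
  }
JSON

# Returns the names of tests that executed at least one changed line.
# `changed` maps a file path to the set of modified line numbers.
def tests_to_run(coverage_map, changed)
  selected = coverage_map.select do |_test, files|
    files.any? do |file, lines|
      changed.key?(file) && lines.any? { |line| changed[file].include?(line) }
    end
  end
  selected.keys
end

# Pretend a diff told us that line 11 of lib/record.rb was modified.
changed = { "lib/record.rb" => Set[11] }
puts tests_to_run(coverage_map, changed)
```

A real tool would build the changed-lines hash by parsing diff output; that part is left out of the sketch.
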
So if you're looking for some open source to do, here is something you could
do. I think this would also be useful in other situations. Other questions we
can answer are, what tests run this line? So even if you don't modify your
code, you can just say, given this particular line of code, which tests will
run this particular line, so you know that if you modify it, those are the
tests you probably want to run. Another thing would be, well, what code does
this request run? Let's say, in theory, you have somehow inherited a giant
application and you don't know what code is actually used. If you could use
this in production on one server to say, OK, log the code that's actually
executed for this particular request, maybe you could discover which code paths
in your application are actually important. So I think that would be a very
interesting use for this particular thing, if you have time and you need
something to do.
Alright, I want to talk about memory leaks, and we're going to talk about
memory leaks in Rails. But when I say memory leaks, I'm actually talking about
object leaks. It's a subtle difference, but nerds out there will give you a
hard time if you don't know this difference. Essentially, a memory leak is
memory that you allocated and then never freed: you never called the free
function, so it just keeps growing and growing. An object leak is when you have
objects that you're still pointing at from somewhere, so they never get garbage
collected. The garbage collector is in charge of deallocating those objects,
and somewhere, somehow, you're keeping a reference to them forever, and it just
keeps growing and growing. So what I'm going to talk about is object leaks, not
really memory leaks, but when you look at it your memory does grow and grow and
grow, so it looks like a memory leak.
This is going to be about bug number 776. It is number 776 on the Rails issue
tracker on GitHub. These were all imported from Lighthouse, and I believe this
particular bug was actually imported from Trac, just to give you an idea of how
long this freakin' bug has been around. What it was: if you wrote some code
like this, where you open a transaction and just create users infinitely inside
that transaction, and you graphed your memory usage over time, this is what the
memory usage would look like. This is not an actual chart of memory usage, this
is just creative interpretation. So the question was, why did this leak? The
reason this code leaked, if you went and looked at Rails internals (this is not
the actual code, by the way, this is what the code may have looked like at the
time), is that we open a transaction, we yield to the block, and as soon as your
Active Record object is created we add it to this list of records. And this
list of records just kept growing and growing and growing, because your loop
was infinite, it never ended. The reason we held onto those things is because,
well, if there's a database error, let's say your transaction failed or
something like that, we need to roll back the data. We need to roll back what
happened.
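
The bookkeeping described above can be sketched in a few lines. These are hypothetical stand-ins, not Rails' actual classes: LeakyRecord and LeakyTransaction exist only to show why a list of strong references keeps every record alive.

```ruby
# Hypothetical stand-in for an Active Record object: just an id.
LeakyRecord = Struct.new(:id)

class LeakyTransaction
  attr_reader :records

  def initialize
    @records = [] # strong references: nothing in here can be garbage collected
  end

  # Called for every record created inside the transaction.
  def add_record(record)
    @records << record
  end

  # On a database error, put every tracked record back to its unsaved state.
  def rollback
    @records.each { |record| record.id = nil }
  end
end

tx = LeakyTransaction.new
1_000.times { |i| tx.add_record(LeakyRecord.new(i + 1)) }
puts tx.records.size # every record created so far is still reachable
tx.rollback
```

With an infinite loop in place of the 1,000 iterations, the list never stops growing, which is the object leak.
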
So what I mean by this is, let's say you open a transaction and you create a new
record. Before you save the record the ID is nil, but after you save the record
you actually get an ID back for it. So you might need that ID. Now let's say
something happens: raise, oh no, an error. There's an error, you rescue it, and
you say, you're not the boss of me, I want to run my code anyway. After you
rescue, the ID needs to go back to nil, right? We need to roll back what
happened, because that record didn't actually get saved to the database. So
that's why we have to keep track of all those records that were created: in
case something goes wrong, we can roll them back to new. Now, the way that this
was working is we have a transaction object, and that transaction object is
pointing to all these records, and the root of the process is pointing at the
transaction. So our transaction is a root object, and that object is holding
references to all these other objects, and that's where we're getting an object
leak: these objects are being created infinitely. So how do we fix this?
Well, the way that we fix it, and what was suggested back then, was to use a
weak reference. What the weak reference does is basically put a nice little
dotted line there. You see those lines are dotted, that's the difference. What
it actually means is that if you're not holding onto one of those records, the
garbage collector can come along and collect it. Then we look at each
reference and we only roll back the live objects, the objects that you're
holding a reference to. Unfortunately, back then weak references didn't work
across all Rubies. It didn't work on MRI at all, so we just didn't do it. It
just sat there, and that's why this bug had been sitting around forever and
ever. Finally one day I had an idea. I said, well, instead of having the
transaction point at all the records, why don't we just flip those arrows
around and have every record point back at the transaction? Then if you're
holding a reference to two records, all those other records that you're not
holding a reference to will get garbage collected, because our references are
pointing back up towards the transaction.
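
A minimal sketch of the flipped arrows, using made-up class names rather than Active Record's real internals. The transaction no longer keeps a list; each record remembers its transaction, and the id reader has to consult it.

```ruby
# Hypothetical transaction: knows only its own state, not its records.
class FlippedTransaction
  def initialize
    @rolled_back = false
  end

  def rollback!
    @rolled_back = true
  end

  def rolled_back?
    @rolled_back
  end
end

# Hypothetical record: points back at the transaction it was saved in.
class FlippedRecord
  def initialize
    @id = nil
    @transaction = nil
  end

  def save(transaction, new_id)
    @transaction = transaction
    @id = new_id
  end

  # The price of the flip: every read of id has to check the transaction.
  def id
    return nil if @transaction&.rolled_back?
    @id
  end
end

tx = FlippedTransaction.new
record = FlippedRecord.new
record.save(tx, 42)
puts record.id.inspect # 42 while the transaction is healthy
tx.rollback!
puts record.id.inspect # nil once it has been rolled back
```

Records nobody else references are free to be collected, because nothing in the transaction points at them.
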
Now unfortunately this makes some of the implementation of Active Record a
little bit more difficult. For example, if you look at the id method on your
Active Record object, it has to say, hey, am I inside a transaction? If I'm
inside a transaction, has it been rolled back? If it's rolled back I need to
return nil, otherwise I'm just going to return whatever my ID is supposed to
be.

I have to say, I don't like weak references, and for a while I couldn't
understand why I didn't like them. I just had this gut reaction, like, no, no,
I don't want to use weak references, I just don't want to use them. And people
were like, no, this is the solution, really. Finally I figured out how to
articulate my hatred towards them. Not really hatred, I don't hate them; I can
articulate my annoyance with them, and I'll show you. This is what the code
looks like if we were to use a weak reference implementation. Instead of
pushing your newly created Active Record object onto that list, we would push a
weak reference that wraps your record. Then if there are any database
transaction errors, we loop over all those weak references and we say, hey, are
you alive? If you're alive then we'll go ahead and roll you back, but if you're
not alive then we won't. What's nice about this is that it actually simplifies
the implementation of id. We don't need to have all this in-transaction
shenanigans, we can just say, hey, go ahead and get the attribute, and then
we're good to go. So it makes our id implementation much simpler.
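
The weak reference variant can be sketched with Ruby's standard weakref library. WeakRecord and WeakTransaction are hypothetical names for illustration, not code from Rails.

```ruby
require "weakref"

# Hypothetical stand-in for an Active Record object.
WeakRecord = Struct.new(:id)

class WeakTransaction
  def initialize
    @refs = []
  end

  # Track the record through a weak reference so the list alone
  # does not keep it alive.
  def add_record(record)
    @refs << WeakRef.new(record)
  end

  # Roll back only the records somebody else is still holding on to.
  def rollback
    @refs.each do |ref|
      begin
        ref.id = nil if ref.weakref_alive?
      rescue WeakRef::RefError
        # The record was collected between the check and the write.
      end
    end
  end
end

tx = WeakTransaction.new
kept = WeakRecord.new(1)                  # we hold a strong reference to this one
tx.add_record(kept)
1_000.times { |i| tx.add_record(WeakRecord.new(i + 2)) } # nobody holds these
tx.rollback
puts kept.id.inspect # nil: the live record was rolled back
```

Note that the list of WeakRef wrappers still grows with every record, and which of the unheld records are still alive at rollback time depends on when the garbage collector last ran.
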
So this is what our graph looks like. Our graph would be a bunch of records with
all these dotted arrows, and the dotted arrows are still there for some reason.
So let's talk about why the dotted arrows are still there. Let's consider that
we had an infinite number of records. What would that mean? If we had an
infinite number of records, that would mean that we also had an infinite number
of weak references. Now the question is, are weak references free? Is
allocating a weak reference free? And the answer is no, unfortunately not.
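
A rough way to check this on MRI, assuming the objspace and weakref standard libraries. The exact byte count depends on the Ruby version and platform, so the sketch prints whatever it measures rather than assuming a number.

```ruby
require "objspace"
require "weakref"

target = Object.new
ref = WeakRef.new(target)

# memsize_of reports how much memory MRI attributes to this one object.
size = ObjectSpace.memsize_of(ref)
puts "one WeakRef: #{size} bytes"
puts "100,000 of them: #{size * 100_000} bytes"
```
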
This is how you can measure it. You can require this thing called objspace,
require weakref, allocate a weak reference, and actually measure the size of a
weak reference on MRI. You can see there that it's 40 bytes. So if we allocated
an infinite number of weak references, we would still use an infinite amount of
memory, because we're still holding on to all those weak references. So the
interesting thing is we still have a memory leak; it's just that it might leak
slower, because the size of a weak ref might be smaller than the size of an
Active Record object. And to me a slow leak is still a leak. It's still a bug.

The other thing that really bothers me about this implementation is that you
have random execution: your code executes randomly. What I mean by this is, if
we look at that code, we have that conditional there that says if the weak ref
is alive, then we'll roll back this object. But the problem is that objects can
be collected at any time. They are collected at random; we don't know when
they're going to be collected. So if you look at this code and you think, OK,
when is collection going to happen on the object? Randomly. It means that this
is going to return true randomly, and that bothers me a lot.
The other thing is that we have random performance. If you look at this loop
here, we have O(n) performance on this: we have to loop over every single
object. If you added, you know, a hundred thousand records to this thing, we
would have to loop a hundred thousand times, and it might be more expensive
depending on whether or not the object is alive. Since that's random, and it's
O(n) performance, and it may be more expensive when there's a live Active Record
object, we can't tell what the performance of this particular loop is going to
be. So I sit back and I think to myself, oh my god, imagine we use this
implementation. Imagine the bug report that comes in. It's going to start out
like this. It's going to start out: "Sometimes".

And I don't know about you, but as an engineer, that is like the worst word
that anybody can say: "sometimes". That's right up there with, you know, "can't
you just". I love that one. "Can't you just". So people wonder, should I use
weak references? And the answer is, well, maybe. It depends. You need to look at
your particular use case and think very carefully about it, but just remember
that simple code may not be so simple. OK, so using weak references we were
able to get a simpler implementation of the id method, much simpler, but the
price we have to pay may have been a little bit too high. Anyway, weak
references are not in Rails, that bug is fixed, don't worry about it, it's OK.
Alright, so I want to talk about integration tests next.
So I want to look at something. This is a controller test in Rails. OK,
controller test. Here is an integration test. Alright: controller test,
integration test. I looked at those and I was thinking to myself, why do we
write controller tests? Those look exactly the same. They look exactly the
same, and I kept thinking about it, and the only reason I could think of is
because integration tests are slow. They're too slow. Then I kept thinking
about it even more: OK, integration tests are slow, maybe that's the reason
nobody writes them. I mean, I certainly don't write them, because they're slow.
Well, then I thought to myself, if the tests are slow, is our website slow too?
And I don't think that's true. I mean, my website is fast enough to serve up
requests. I'm happy with it, the customers are happy with it. If it's fast
enough for my users, why isn't it fast enough for my tests? Something doesn't
add up to me when I think about these things. It seems to me that if the
website is fast enough for the users, the integration tests should be fast
enough for the test suite too.

So what I want to do for Rails 5 is speed up our tests, have faster tests, and
I eventually want to delete functional tests as well. Not actually delete them,
but what I want to do is implement functional tests in terms of integration
tests. I want to get integration tests so fast that you wouldn't know whether
it's an integration test or a functional test, and then just replace the
functional test framework out from under you and delete a bunch of code. So I'm
wondering to myself, OK, why is it so slow? Why is our integration testing so
slow? I have to give a little bit of credit to Eileen; she wrote most of the
benchmarks that go along with this. So let's look at the benchmarking code,
look at tools for analyzing the speed of this code, and figure out why it's so
slow.
If we need to know how slow it is, I like to use a gem called benchmark-ips.
Install it, use this gem. What it does is measure iterations per second. This
is the report that we use. This report just measures the difference between an
integration test and a controller test, in fact the tests that I just showed
you a few slides earlier. It just compares the two, and if you look right here
it does a comparison and prints out the comparison. If we run this we'll see
that integration tests are 2.47 times slower, so almost 2.5 times slower than
functional tests. So OK, we know how slow it is, but what is it that's slow?
How do I know what part of it is slow? For that I use stackprof. This is
another gem, built on top of Ruby, and I'll talk about that a little bit.
This particular test is testing the performance of a controller test, not an
integration test. I want to look at the performance of controller tests, then
look at the performance of integration tests, and see if I can see a difference
between those two. Maybe then I can figure out what's wrong. So here we're
testing the controller, and we're doing a CPU profile on this. This is running
in CPU mode, and it's going to output its profiling results to this file, a
stackprof dump, that we can look at later.
Now I want to talk a little bit about CPU time versus wall time. The stackprof
gem will run in two modes: it runs in wall mode and CPU mode. The difference is
that CPU mode only measures the time your code is on the CPU; it doesn't measure
the actual wall time. To give you an idea, for example, the sleep call is slow.
If you said sleep 10, it's going to sit there for 10 seconds. That's very slow,
but it's not using the CPU, it's just sitting there. So that's the main
difference: wall will tell you that time is being spent in sleep, and CPU will
not. This is an example of running it with wall. All we do is just switch cpu
to wall, that's the only difference. Then we can run stackprof on that dump
file, and you'll see this is what our stack looks like. If you read this
closely, we're spending about fifty-three percent of our time inside of
minitest. So Minitest::Runnable on_signal, that was the top in our profile.
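
As an aside, the CPU-versus-wall distinction can be seen without any profiler, using only the clocks Ruby exposes through Process.clock_gettime. This is not how stackprof works internally; the measure helper is a made-up name for the sketch.

```ruby
# Runs the block and reports elapsed wall-clock time and CPU time in seconds.
def measure
  wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu_start  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
  yield
  {
    wall: Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_start,
    cpu:  Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - cpu_start
  }
end

sleeping = measure { sleep 0.2 }                          # waits, barely uses the CPU
spinning = measure { 200_000.times { |i| Math.sqrt(i) } } # CPU-bound work

puts format("sleep: wall %.3fs, cpu %.3fs", sleeping[:wall], sleeping[:cpu])
puts format("spin:  wall %.3fs, cpu %.3fs", spinning[:wall], spinning[:cpu])
```

For the sleeping block the CPU time stays far below the wall time; for the CPU-bound block the two numbers come out close to each other, which is why a CPU-bound program should produce similar CPU and wall profiles.
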
OK, so I know that's the thing I need to target. I go and look at minitest,
look at on_signal. It does a bunch of stuff, and I'm like, you know what?
Delete it. Don't need that, just delete it. So OK, great, I deleted that, ran
the test again, and I see that it's out of the stack trace now. It's not there,
it's not in the frames. So, amazing: we're fifty percent faster, right?
Automatically fifty percent faster. So I was pretty pleased with myself, very,
very happy. There's a picture of me.

And I was like, well, OK, it says we're fifty percent faster, but I don't feel
fifty percent faster. Let's time it. So I ran time on the benchmark. This is
before my patch, before deleting that code, and this is after. You'll notice it
is not fifty percent faster. So then this is my face.

Now unfortunately the thing is, this program was CPU-bound. If you run this
code and you look at Activity Monitor, you'll see it's using a hundred percent
of the CPU. What that means is that the wall time should be about the same as
the CPU time, because you're CPU-bound. So the wall report should be similar to
the CPU report. This is what the CPU report looks like: you'll see the top
frame there is minitest on_signal. And then the wall report (oops, I made a typo
up there, it should say wall): you'll notice if you run with wall time, that
frame wasn't there at all.

So I contacted Aman, who wrote the tool, and I said, hey dude, I'm working on
this thing, doing some performance testing, and it says I'm spending fifty
percent of my time in minitest, but when I delete that it doesn't tell the
whole story. This is extremely weird. And he's like, that is weird. Yeah,
thanks, dude. He didn't know, and the reason he doesn't know is because this
gem is actually built using Ruby's C API. If you go look, Ruby actually
provides C APIs for getting this particular information. All his gem does is
grab that info and return it to you in a nice report, essentially. So I
contacted Koichi, who wrote the Ruby virtual machine, and talked to him about
it. I went through the same thing: this is what it does, it seems really
weird. And he's like, yeah, that is weird. OK, great. So we start debugging it
together, debugging, debugging, debugging, and it turns out that it is a bug in
a trap call, but it only exists on OS X specifically. If you run the same
benchmark on Linux, it's totally fine.

Anyway, so the point here is that even profilers have bugs. So we made all
that change and nothing happened, it didn't improve at all. I'm really, really
sorry, I totally wasted your time. Let's get onto something useful.
Alright, so we'll actually do something useful. Let's profile integration
tests. We do the same thing, but this time we use wall, because we know that
the CPU one has bugs in it and we're CPU-bound, so we're just gonna use the
wall one. We see the top call there is a call to delegate. We're actually
spending twenty-seven percent of our time in there. So we're calling delegate,
and calling delegate is slow. So I went in and I said, OK, we're gonna stop
doing that. We won't call delegate anymore, we'll just define these methods
directly. So we change the code to define the methods directly, and check our
progress. We started out at 2.47 times slower, and if we run that same
comparison benchmark again, we'll see that we're down to 1.45 times slower, so
45% slower now.
OK, so we can verify that by running time, and we'll see before it took 26
seconds and after it took 16 seconds. So the point here is that even profilers
have bugs, and you should always, always measure and verify your results. Don't
just take one profiling tool's word for it, because you could be running into a
bug just like I was.

The next thing I want to talk about is GC time and allocations. Before, we were
looking at runtime performance, specifically the speed of method calls. The
other problem that we may have is garbage collection time. If you look up
there, I know it's kind of small, but if you run these at home you'll see that
it says GC there, and it tells you the amount of time you're spending in the
GC. Here we're spending about five percent of our time in the GC. I want to
talk about how you can find allocations and how to reduce those allocations in
your application. What I use is a gem called allocation_tracer. It's written by
Koichi, and the way that you use it is just like this. You say, OK, I'm gonna
set up my allocation tracer. This is my actual test, and if you look at that top
line, we're running the test once before we actually measure allocations. The
reason I'm doing that is because we have a bunch of caches, and we need to make
sure those caches are warmed up before we actually run the allocation test. So
we say, I want to know path, line and type, run the test a bunch of times, and
then that bit at the bottom there just prints out our top five allocations.

This is what our top five allocations look like. The report starts with the
file that the allocation is happening in, the line where it's happening, and
the type of object that's being allocated. And right here (my arrow is not
going there, there it goes) is the total number of allocations at that point.
So you can see at that point we allocated about 57,000 strings at that place.
So it's very simple: you just go look at that line. This is what the line looks
like. What this is doing is, inside of Rails we're quoting column names for
SQLite. Whenever you say, you know, I want to insert a bunch of stuff into
wherever, it's got to escape those column names and table names. This is the
method that it uses to do that, and it's actually allocating a few strings
every time. Those two string literals are an allocation, the gsub is an
allocation, and the quote there, that outer one, is an allocation as well. Now
what's interesting is that inside your system the table names are finite. You
have a finite number of tables in your database, hopefully. If you have an
infinite number of tables, I'm sorry, we don't support that. Column names are
also finite, which means that it's OK for us to introduce a cache here.
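
A minimal sketch of such a cache, with allocations counted through GC.stat instead of the allocation_tracer gem. The quoting method is an approximation of the kind of code described, not the exact Rails source, and the plain Hash is an assumption for the example (a real multi-threaded server would want a thread-safe map).

```ruby
# Approximation of the uncached quoting: builds new strings on every call.
def quote_column_name(name)
  %Q("#{name.to_s.gsub('"', '""')}")
end

# Cached variant. Column and table names are finite, so this hash grows to a
# fixed size and then stops.
QUOTED_NAMES = {}

def cached_quote_column_name(name)
  QUOTED_NAMES[name] ||= quote_column_name(name)
end

# Number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

column = "users"
cached_quote_column_name(column) # warm the cache before measuring

uncached = allocations { 1_000.times { quote_column_name(column) } }
cached   = allocations { 1_000.times { cached_quote_column_name(column) } }

puts "uncached: #{uncached} objects allocated"
puts "cached:   #{cached} objects allocated"
```
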
So what we did is just change this to cache those particular escapes. We know
that this cache is going to grow to a particular level and then just stop. So
we'll say, OK, for your particular string we'll calculate what the escaped
version is and then return that to you. If we rerun the allocation test, you'll
notice that thing is missing. We're still seeing a bunch of stuff allocated in
Rack utils and blah blah blah, other places. Now we can focus on those other
things, but you'll see that one is gone.

So I'm going to talk about memory savings. Well, I'm at 30 minutes, so we will
hurry. Alright, memory savings. This is something I want to do in Rails 5, and
we've kind of started on it. What I want to do is make some copy-on-write
improvements in Rails 5, and what I'm talking about is template compilation.
All of your ERBs, your Hamls, your Slims, or whatever you use for templates
today, they're lazily compiled. The way that works is, when a request comes in
we get a lock and we say, hey, is the template compiled? If the template is
compiled, then we unlock and we return the compiled template to you. If it's
not compiled, we actually have to compile the template, then we save the
template, and then we unlock and return it to you. What's interesting is that
that save writes to memory.

Now think about this in terms of a threaded web server versus a forking web
server. A threaded web server has to lock in those particular places, and a
forking web server wastes memory, because your ERB template is going to compile
down to exactly the same thing across every single process. So what I want to
do for Rails 5 is precompile these templates. Since these templates are static,
we know what their data is, we should be able to compile them in advance:
compile them before fork in the master process, and then fork and have all of
our children use those afterwards. And if we're using a threaded web server, we
can change it to a read-only cache and eliminate that lock. So essentially
we'd get rid of all that interior part. That whole thing just goes away and it
happens on boot now.
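
The two strategies can be contrasted in a small sketch built on the standard erb library. The class names and the template are hypothetical, and this is a simplification: Rails compiles templates into methods rather than keeping ERB objects around.

```ruby
require "erb"

# Hypothetical template sources, keyed by name.
TEMPLATE_SOURCES = { "greeting" => "Hello, <%= name %>!" }.freeze

# Lazy: every lookup takes the lock, and the first lookup for a template also
# compiles it and writes to the cache (after fork, in every process).
class LazyTemplates
  def initialize(sources)
    @sources = sources
    @compiled = {}
    @lock = Mutex.new
  end

  def fetch(key)
    @lock.synchronize { @compiled[key] ||= ERB.new(@sources.fetch(key)) }
  end
end

# Eager: compile everything once at boot (before fork), then only ever read.
# The frozen hash is never written again, so no lock is needed.
class PrecompiledTemplates
  def initialize(sources)
    @compiled = sources.transform_values { |source| ERB.new(source) }.freeze
  end

  def fetch(key)
    @compiled.fetch(key)
  end
end

lazy = LazyTemplates.new(TEMPLATE_SOURCES)
eager = PrecompiledTemplates.new(TEMPLATE_SOURCES)

puts lazy.fetch("greeting").result_with_hash(name: "Rails")
puts eager.fetch("greeting").result_with_hash(name: "Rails")
```

Both print the same greeting; the difference is when the compilation and the memory write happen.
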
Yes, like I said, this removes locks in threaded web servers, shares memory with
the parent, and it should speed up our views and drop memory usage. I don't
remember if I put this in the slides or not, but I want to talk a little bit
about how much memory we would save. With ERB we can easily measure that with
ObjectSpace, which I was showing you earlier. Using memsize_of_all, we can
measure the amount of memory that a particular type of object is taking. In
MRI there's an object called RubyVM::InstructionSequence. Those are the actual
instruction sequences that your code gets compiled down to. So what we do is
measure how many instruction sequences have been compiled before we evaluate
the ERB, then we evaluate the ERB and measure the size afterwards, and then we
know what size your ERB template takes. We see that one of the templates in our
application took about 24k. We recently converted our entire site to Haml, and
it was literally just a one-to-one conversion. The code was too small to read,
but that's the class we're measuring. I wanted to compare it to Haml, because
we really just did a one-to-one translation. The same code using Haml has a
memsize of about 25k, so not much difference.
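
The before-and-after measurement described above can be approximated like this. It is MRI-specific (RubyVM does not exist on other Ruby implementations), the template is invented for the example, and the number it prints varies by Ruby version, so treat it as a rough estimate only.

```ruby
require "objspace"
require "erb"

# Total memory MRI attributes to instruction sequence objects right now.
def iseq_memsize
  GC.start
  ObjectSpace.memsize_of_all(RubyVM::InstructionSequence)
end

# Hypothetical template; ERB#src is the Ruby code ERB generated for it.
template = ERB.new("<ul><% items.each do |item| %><li><%= item %></li><% end %></ul>")

before = iseq_memsize
# Keep a reference to the compiled sequence so it is not collected before
# the second measurement.
compiled = RubyVM::InstructionSequence.compile(template.src)
after = iseq_memsize

puts "compiled template: roughly #{after - before} bytes of instruction sequences"
```
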
What's interesting is that our savings will be the compiled template size
multiplied by the number of workers that we have; that'll be our total memory
savings. So it depends on how many templates you have and how many workers you
use. Another interesting thing, and a possible downside, is that, say you have a
million views, but you only use one of those views, or one percent of those
views. We're gonna start compiling all those views, and it might be a waste of
memory. So what I want to do for Rails 5 is basically say, OK, you can opt in
to have all of them compiled or none of them compiled. We'll start out with
that, and then later on maybe we can introduce a way where you can partially
compile some of them. So Rails 4.2 introduced AdequateRecord; in Rails 5 I
would like to introduce adequate views. So I think I'm over my time. Thank you
very much.
