Summarizing by Group
Use summarize() to Make Summary Tables
The summarize() function builds a new data table from the summary statistics you ask for. A typical example is to use it with basic statistical functions like mean() and sd().
Here, we’ll add a new function, n(), which counts rows, and use summarize() to understand more about treat price.
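As a minimal sketch of that step, the code below summarizes price with mean(), sd(), and n(). The treats data frame, its price column, and all of its values are invented here for illustration; substitute the real treat data and column names.

```r
library(dplyr)

# Hypothetical stand-in for the treat data; names and values are assumptions.
treats <- data.frame(
  flavor = c("chicken", "chicken", "beef", "beef", "salmon"),
  price  = c(4, 6, 5, 7, 8)
)

# With no grouping, summarize() collapses the whole table to a single row.
price_summary <- treats %>%
  summarize(
    mean_price = mean(price),  # average price
    sd_price   = sd(price),    # sample standard deviation of price
    n_treats   = n()           # number of rows (treats) summarized
  )

price_summary
```

The result has one row and one column per statistic named inside summarize().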
Combine with group_by() for More Power
summarize() is often used in conjunction with group_by() to get more refined descriptions of the data.
group_by() alone doesn’t look like it does anything, but it adds hidden structure that lets summarize() compute its results separately for each distinct value in the grouping column.
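One way to see that hidden structure is to inspect a grouped table directly. This is a small sketch using an invented treats data frame (the values are assumptions, not the lesson's data); group_vars() and n_groups() are dplyr helpers that report the grouping.

```r
library(dplyr)

# Hypothetical stand-in for the treat data; values are assumptions.
treats <- data.frame(
  flavor    = c("chicken", "chicken", "beef", "beef", "salmon"),
  yumminess = c(8, 10, 5, 7, 8)
)

grouped <- treats %>% group_by(flavor)

# The rows and columns are unchanged by grouping...
same_rows <- nrow(grouped) == nrow(treats)

# ...but the table now remembers which column defines the groups.
grouping_columns <- group_vars(grouped)
number_of_groups <- n_groups(grouped)
```

Printing grouped also shows a "Groups:" line above the data, which is the only visible sign that anything changed.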
We can use the two functions together to find the mean yumminess for each flavor and see which kind of treat Charlie likes best.
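A sketch of that calculation might look like the following. The data values are made up for illustration, and the arrange(desc(...)) step is an optional addition that puts the highest-scoring flavor first.

```r
library(dplyr)

# Hypothetical stand-in for the treat data; values are assumptions.
treats <- data.frame(
  flavor    = c("chicken", "chicken", "beef", "beef", "salmon"),
  yumminess = c(8, 10, 5, 7, 8)
)

yum_by_flavor <- treats %>%
  group_by(flavor) %>%                            # one group per flavor
  summarize(mean_yumminess = mean(yumminess)) %>% # one row per group
  arrange(desc(mean_yumminess))                   # best-liked flavor first

yum_by_flavor
```

The output has one row per flavor, so the top row answers the question directly.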
We can group_by() multiple columns at once and get the unique pairs (or triplets, etc.) of values in those columns. If we want to learn which flavors of treat last the longest, we can group by flavor and long_lasting and summarize using the n() function to see how many treats of each flavor are long lasting and how many are not.
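The two-column grouping could be sketched as below. The data frame is invented for illustration, and long_lasting is assumed to be a TRUE/FALSE column. The .groups = "drop" argument assumes dplyr 1.0.0 or later; it returns an ungrouped result and can be omitted.

```r
library(dplyr)

# Hypothetical stand-in for the treat data; values are assumptions.
treats <- data.frame(
  flavor       = c("chicken", "chicken", "beef", "beef", "beef", "salmon"),
  long_lasting = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

lasting_counts <- treats %>%
  group_by(flavor, long_lasting) %>%           # one group per flavor/long_lasting pair
  summarize(n_treats = n(), .groups = "drop")  # count the rows in each pair

lasting_counts
```

Only pairs that actually occur in the data get a row, so a flavor with no long-lasting treats will simply have no TRUE row.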
Using group_by() and summarize() together gives us a lot of powerful options with just a little bit of code. And it is much faster than doing the equivalent procedure in Excel or Google Sheets.