These are my notes from working through the book Data Visualization: A Practical Introduction, by Kieran Healy:


Let’s first load all required libraries:
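The loading chunk itself is not shown in these notes, so here is a minimal sketch of what it could look like. The package set is an assumption based on what the later code calls: ggplot2 for plotting, gapminder for the data, and scales for the axis labels.

```r
# Assumed package set, inferred from the functions used further down.
library(ggplot2)    # ggplot(), aes(), geom_point(), geom_smooth(), labs()
library(gapminder)  # the gapminder data frame
library(scales)     # dollar labels for scale_x_log10()
```

If any of these are missing, install.packages() with the package name should fetch them from CRAN.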


Now that our functions are loaded through their respective libraries, we can start loading some data from the gapminder package:
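The data chunk is also not shown, so this is only a sketch of one way to look at the data. It assumes the gapminder package is installed; the data frame it ships is also called gapminder.

```r
library(gapminder)

# gapminder becomes available as a data frame once the package is attached.
head(gapminder)   # first few rows
str(gapminder)    # column names and types
```

The columns used in the plots below are gdpPercap, lifeExp and continent.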


Let’s make a quick plot:

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp))
p + geom_point()


The central activity of visualizing data with ggplot more or less always involves the same sequence of steps. There is some structured relationship, some mapping, between the variables in your data and their representation in the plot displayed on your screen or on the page. Ggplot provides you with a set of tools to map data to visual elements on your plot, to specify the kind of plot you want, and then to control the fine details of how it will be displayed. In ggplot, the logical connections between your data and the plot elements are called aesthetic mappings, or just aesthetics.

- Tell the ggplot() function what your data is.
- Then tell it how the variables in this data logically map onto the plot’s aesthetics.
- Take the result and say what general sort of plot you want. In ggplot, the overall type of plot is called a geom. Each geom has a function that creates it.
- You combine these two pieces, the ggplot() object and the geom, by literally adding them together in an expression, using the “+” symbol.

So:

1. Data (what data we want to use)
2. Mapping (what relationships we want to see)
3. Geom (how we want to see the relationships in our data)
4. Coordinates and Scales (reference)
5. Labels and Guides (explanation)
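To see that "adding" really builds the plot up piece by piece, here is a small sketch that inspects the plot object at each step. The name p_points is only for illustration, and reading the layers element directly is just a convenient way to peek inside; it is not something the book relies on.

```r
library(ggplot2)
library(gapminder)

# Steps 1 and 2: data and mapping. No geom yet, so there are no layers.
p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp))
length(p$layers)

# Step 3: adding a geom returns a new plot object with one more layer.
p_points <- p + geom_point()
length(p_points$layers)
```

Printing p on its own gives an empty panel with axes, because the mapping is known but no geom has been chosen yet.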

p + geom_point() + geom_smooth()

p + geom_smooth() + geom_point() # layers are drawn in order, so the points end up on top of the smooth line

p + geom_point() + geom_smooth(method = "lm")

It’s possible to give a geom its own instructions, which it will then follow instead. In the absence of any other information, each geom looks for the instructions it needs in the ggplot() function, or in the object created by it.
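A minimal sketch of giving one geom its own instructions, assuming the same x and y mapping as in the quick plot above. The name p_local is only for illustration.

```r
library(ggplot2)
library(gapminder)

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp))

# color is mapped inside geom_point() only. geom_smooth() still inherits
# x and y from ggplot(), but not color, so it fits one line for all the data.
p_local <- p +
  geom_point(mapping = aes(color = continent)) +
  geom_smooth(method = "lm")
p_local
```

Moving the color mapping up into ggplot() would make both geoms use it, which gives one smoother per continent instead.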

An aesthetic mapping specifies that a variable will be expressed by one of the available visual elements, such as size, color, or shape. The code does not give a direct instruction like “color the points purple”. Instead it says, “the property ‘color’ will represent the variable continent”, or “color will map continent”. The aes() function is for mappings only. Do not use it to set a property to a particular value. If we want to set a property, we do it in the geom_ function we are using, outside the mapping = aes(…) step.
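The difference between mapping and setting is easier to see side by side. This is a sketch using the same data; the object names p_mapped, p_set and p_wrong are only for illustration.

```r
library(ggplot2)
library(gapminder)

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp))

# Mapping: color represents the variable continent, and a legend is added.
p_mapped <- p + geom_point(mapping = aes(color = continent))

# Setting: every point is purple. The value goes outside aes().
p_set <- p + geom_point(color = "purple")

# Common mistake: inside aes(), "purple" is treated as a variable with a
# single value, so the points get a default color and a legend entry
# labelled "purple" rather than actually being purple.
p_wrong <- p + geom_point(mapping = aes(color = "purple"))
```

Printing each of the three objects in turn shows the three different results.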

The various geom_ functions can take many other arguments that will affect how the plot looks, but that do not involve mapping variables to aesthetic elements.
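A short sketch of such arguments, none of which maps a variable to anything. The particular values (0.3, "orange", se = FALSE) are arbitrary choices for illustration.

```r
library(ggplot2)
library(gapminder)

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp))

# alpha makes the points semi-transparent.
# method picks the smoother, se = FALSE drops the confidence ribbon,
# and color sets the line color to a fixed value.
p_args <- p +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm", se = FALSE, color = "orange")
p_args
```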

Now a more polished view:

p <- ggplot(data = gapminder,
            mapping = aes(x = gdpPercap, y = lifeExp,
                          colour = continent, fill = continent))
p + geom_point(alpha = 0.3) +
  geom_smooth(method = "gam") +
  scale_x_log10(labels = scales::dollar) +
  labs(x = "GDP per Capita", y = "Life Expectancy in Years",
       title = "Economic Growth and Life Expectancy, by Continent",
       subtitle = "Data points are country-years",
       caption = "Source: Gapminder")