Link Search Menu Expand Document

Styling Line Graphs

There are several ways of styling line graphs. The following examples demonstrate how to modify the appearances of the lines (type and sizes), as well chart titles and axes labels.

Keep in Mind

  • To get started on how to plot line graphs, see here.
  • Elements for customization include line thickness, line type (solid, dashed, etc.), shade, transparency, and color.
  • Color is one of the easiest ways to distinguish a large number of line graphs. If you have many line graphs overlaid and have to use black-and-white, consider different shades of black/gray.

Implementation

Python

import pandas as pd
import seaborn.objects as so
import numpy as np
import matplotlib.pyplot as plt
from seaborn import axes_style




# Download the economics dataset (from ggplot2 so comparison is apples-to-apples)
url = "https://raw.githubusercontent.com/tidyverse/ggplot2/main/data-raw/economics.csv"
economics = pd.read_csv(url)


# Quick manipulation of dataframe to convert column to datetime
df = (
    economics
    .assign(
        date = lambda df: pd.to_datetime(df['date'])
        )
)



# Default plots (Notice the xaxis only has 2 years! We'll fix this in p2)
p1 = (
    so.Plot(data=df, x='date', y='uempmed')
    .add(so.Line())
    )
p1




## Change line color and chart labels, and fix xaxis
## Note here that color is inside of the Line call, so this would color the line.
## If color were instead *inside* the so.Plot() object, SO would assign it
## a different line for each value of the factor variable (column), colored differently. (Commonly referred to as hue in seaborn)
# However, in our case, we can pass a color directly.
p2 = (
    so.Plot(data=df, x='date', y='uempmed')
    .add(so.Line(color='purple'))
    .label(title='Median Duration of Unemploymeny', x='Date', y='')
    .scale(x=so.Temporal().tick(upto=10)) #Needed for current configuration of seaborn.objects so xaxis prints more than 2 ticks
    .theme(axes_style("whitegrid")) #use a function from parent seaborn library, that will pass a prebuilt selection based on what you pass
    )

p2



## plotting multiple charts (of different line types and sizes)
p3 = (
    so.Plot(data=df)
    .add(so.Line(color='darkblue', linewidth=5), x='date', y='uempmed')
    .add(so.Line(color='red', linewidth=2, linestyle='dotted'), x='date', y='psavert')
    .label(title='Unemployment Duration (Blue)\n & Savings Rate (Red)', 
           x='Date', 
           y='')
    .scale(x=so.Temporal().tick(upto=10)) #Needed for current configuration of seaborn.objects so xaxis prints more than 2 ticks
    .theme(axes_style("whitegrid")) #use a function from parent seaborn library, that will pass a prebuilt selection based on what you pass
    )

p3


## Plotting a different line type for each group
## There isn't a natural factor in this data so let's just duplicate the data and make one up
df['fac'] = 1
df2 = df.copy()
df2['fac'] = 2
df2['uempmed'] = df2['uempmed'] - 2 + np.random.normal(size=len(df2))
df_final = pd.concat([df, df2], ignore_index=True).astype({'fac':'category'})


p4 = (
    so.Plot(data=df_final, x='date', y='uempmed', color='fac')
    .add(so.Line())
    .label(title = "Median Duration of Unemployment",
           x = "Date",
           y = "",
           color='Random Factor')
    .scale(x=so.Temporal().tick(upto=10)) #Needed for current configuration of seaborn.objects so xaxis prints more than 2 ticks
    .theme(axes_style("whitegrid")) #use a function from parent seaborn library, that will pass a prebuilt selection based on what you pass
)

p4



# Plot all 4 plots
fig, axs = plt.subplots(2, 2, figsize=(10, 8))
# Draw each plot in the corresponding subplot
p1.on(axs[0, 0]).plot()
p2.on(axs[0, 1]).plot()
p3.on(axs[1, 0]).plot()
p4.on(axs[1, 1]).plot()

# Adjust layout to avoid overlap
plt.tight_layout()

# Show the combined plot
plt.show()

The four plots generated by the code are (in order p1, p2, then p3 and p4):

Four Styled Line Graphs in Python

R

## If necessary
## install.packages(c('ggplot2','cowplot'))
## load packages
library(ggplot2)
## Cowplot is just to join together the four graphs at the end
library(cowplot)

## load data (the Economics dataset comes with ggplot2)
eco_df <- economics

## basic plot
p1 <- ggplot() +
  geom_line(aes(x=date, y = uempmed), data = eco_df)
p1

## Change line color and chart labels
## Note here that color is *outside* of the aes() argument, and so this will color the line
## If color were instead *inside* aes() and set to a factor variable, ggplot would create
## a different line for each value of the factor variable, colored differently.
p2 <- ggplot() +
  ## choose a color of preference
  geom_line(aes(x=date, y = uempmed), color = "navyblue", data = eco_df) +
  ## add chart title and change axes labels
  labs(
    title = "Median Duration of Unemployment",
    x = "Date",
    y = "") +
  ## Add a ggplot theme
  theme_light()
  ## center the chart title
  theme(plot.title = element_text(hjust = 0.5)) +

p2

## plotting multiple charts (of different line types and sizes)
p3 <-ggplot() +
  geom_line(aes(x=date, y = uempmed), color = "navyblue",
            size = 1.5, data = eco_df) +
  geom_line(aes(x=date, y = psavert), color = "red2",
            linetype = "dotted", size = 0.8, data = eco_df) +
  labs(
    title = "Unemployment Duration (Blue) and Savings Rate (Red)",
    x = "Date",
    y = "") +
  theme_light() +
  theme(plot.title = element_text(hjust = 0.5))

p3

## Plotting a different line type for each group
## There isn't a natural factor in this data so let's just duplicate the data and make one up
eco_df$fac <- factor(1, levels = c(1,2))
eco_df2 <- eco_df
eco_df2$fac <- 2
eco_df2$uempmed <- eco_df2$uempmed - 2 + rnorm(nrow(eco_df2))
eco_df <- rbind(eco_df, eco_df2)
p4 <- ggplot() +
  ## This time, color goes inside aes
  geom_line(aes(x=date, y = uempmed, color = fac), data = eco_df) +
  ## add chart title and change axes labels
  labs(
    title = "Median Duration of Unemployment",
    x = "Date",
    y = "") +
  ## Add a ggplot theme
  theme_light() +
  ## center the chart title
  theme(plot.title = element_text(hjust = 0.5),
        ## Move the legend onto some blank space on the diagram
        legend.position = c(.25,.8),
        ## And put a box around it
        legend.background = element_rect(color="black")) +
  ## Retitle the legend that pops up to explain the discrete (factor) difference in colors
  ## (note if we just want a name change we could do guides(color = guide_legend(title = 'Random Factor')) instead)
  scale_color_manual(name = "Random Factor",
                     # And specify the colors for the factor levels (1 and 2) by hand if we like
                     values = c("1" = "red", "2" = "blue"))
p4

# Put them all together with cowplot for LOST upload
plot_grid(p1,p2,p3,p4, nrow=2)

The four plots generated by the code are (in order p1, p2, then p3 and p4):

Four Styled Line Graphs in R

Stata

In Stata, one can create plot lines using the command line, which in combination with twoway allows you to modify components of sub-plots individually. In this demonstration, I will use minimal formatting, but will apply minimal modifications using Ben Jann’s grstyle.

** Setup: Install grstyle
ssc install grstyle
grstyle init
grstyle color background white
grstyle set legend, nobox

Setup

First, you need to load the data into Stata. The data is a copy from the data economics available within ggplot package, and translated using foreign.

use https://friosavila.github.io/playingwithstata/rnd_dta/economics, clear
** Since this was taken directly from R, the date variable will not be formatted.
** We can format the date using the following.
format date %tdCCYY
** This indicates to create a _mask_, to put on top of "data"
** but only display the "year"

Simple line plot

Now, For a simple plot, we could use the following syntax:

line yvar1 [yvar2 yvar3 ...] xvar1

This requests plotting all variables yvarX against xvar1 (horizontal axis). Internally, the command connects every pair of data [yvar1,xvar1] sequentially, based on the order they appear in the dataset. Below, we can do that, plotting unemployment duration uempmed vs date.

line uempmed date

Simple Line Graph in Stata

Something to keep in mind. If the dataset is not sorted by date, you may end up with a lineplot that is all over the place. For example:

sort uempmed
line uempmed date

Unsorted line graph in Stata

To avoid this, it is recommended to use the option sort.

line uempmed date, sort

Simple Line Graph in Stata

Adding titles, and axis titles

The next thing you may want to do is add information to the plot, so its easier to understand what the figure is showing. Specifically, we can add information on the vertical axis using ytitle(). I will also use xtitle() to drop the horizontal axis information, and add a title title().

line  uempmed date, sort ///
      ytitle("# of weeks") xtitle("") ///
      title(Unemployment Duration)

Stata line graph with title

Changing Line characteristics.

It is also possible to modify the line width lwidth(), line color lcolor(), and line pattern lpattern(). To show how this can affect the plot, below 4 examples are provided.

Notice that each plot is saved in memory using name(), and all are combined using graph combine.

line uempmed date, sort ///
ytitle("# of weeks") xtitle("") ///
title(Unemployment Duration 1) ///
lwidth(.5) lcolor(red) lpattern(solid) name(m1,replace)

line uempmed date, sort ///
ytitle("# of weeks") xtitle("") ///
title(Unemployment Duration 2) ///
lwidth(.25) lcolor(gold) lpattern(dash) name(m2,replace)

line uempmed date, sort ///
ytitle("# of weeks") xtitle("") ///
title(Unemployment Duration 3) ///
lwidth(1) lcolor("68 170 153") lpattern(dot) name(m3,replace)

line uempmed date, sort ///
ytitle("# of weeks") xtitle("") ///
title(Unemployment Duration 4) ///
lwidth(.5) lcolor(navy%50) lpattern(dash_dot) name(m4,replace)

graph combine m1 m2 m3 m4

Four styled line graphs in Stata

Ploting Multiple Lines, and different axis

You may also want to plot multiple variables in the same figure. There are two ways to do this:

twoway  (line uempmed date, sort lwidth(.75) lpattern(solid) ) ///
        (line psavert date, sort lwidth(.25) lpattern(dash) ), ///
        legend (order(1 "Unemployment duration" 2 "Saving rate")) 
  
line  uempmed psavert  date, sort lwidth(0.75 .25) lpattern(solid dash) ///
      legend(order(1 "Unemployment duration" 2 "Saving rate")) 

Both options provide the same figure, however, I prefer the first option since that allows for more flexibility.

Stata line graph with multiple lines

You can also choose to plot each variable in a different axis. Each axis can have its own title.

twoway  (line uempmed date, sort lwidth(.75) lpattern(solid) yaxis(1)) ///
        (line psavert date, sort lwidth(.25) lpattern(dash) yaxis(2)), ///
    legend(order(1 "Unemployment duration" 2 "Saving rate")) ///
    ytitle(Weeks ,axis(1) ) ytitle(Interest rate,axis(2) ) 

Stata line graph with multiple axes

Adding informative Vertical lines.

Finally, it is possible to add vertical lines. This may be useful, for example, to differentiate the great recession period. Additionally, in this plot, I add a note.

twoway  (line uempmed date, sort lwidth(.75) lpattern(solid) yaxis(1)) ///
        (line psavert date, sort lwidth(.25) lpattern(dash) yaxis(2)), ///
        legend(order(1 "Unemployment duration" 2 "Saving rate")) ///
        ytitle(Weeks ,axis(1) ) ytitle(Interest rate,axis(2) )  ///
        xline(`=td(1dec2007)'/`=td(30jun2008)', lcolor(gs8)) ///
        note("Note:Grey Area marks the Great recession Period") ///
        title("Unemployement Duration and" "Saving Rate")

Stata line graph with vertical line marker