Improving Visualizations: Mapping Data to a Dynamic Pitch with ggplot2 in R

This guide is beginner friendly, but it does assume you have at least some working knowledge of R. If you have installed RStudio and know how to install packages, you're ready to get started. If not, feel free to reach out, and I can point you to some resources for getting started with your data in R.

Create a Basic Scatter Plot

Let's begin by loading the tidyverse library, which includes ggplot2. We'll use ggplot2 to create the plot and add the pitch lines later. I also like to store any static variables at the top, so they're easy to adjust in a hurry. You'll see me use them to set up custom font styles and the colors of various elements, so a couple of changes at the top of the file can make sweeping changes to anything style related.
main.r
library(tidyverse)

#Chart style variables
pointColor <- "#059669"
Now onto the more important things. If you're already pulling in actual data and know how to create a scatter plot, you can skip these steps, but for now I'll set the pitchWidth and pitchLength variables manually. I'm using 80 x 120 for the pitch dimensions since (a) it correlates well to actual pitch sizes and (b) it aligns well with our in-house data. It's important to check how your data source handles pitch coordinates now, so you can decide whether to adapt the plot to your data or adapt your data to this plot. I personally choose to adapt any location data I'm working with to fit this plot because I'm a huge proponent of using vertical pitches with players. If you're working with coaches or other analysts, you may opt to orient the plot from left to right.
main.r
#pitch size variables
pitchWidth <- 80
pitchLength <- 120
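If your provider reports locations on a different grid, a small rescaling step can map them onto this vertical pitch. The sketch below assumes a hypothetical source that stores positions as percentages (0 to 100), with x running along the length of the pitch and y across its width. The names toVerticalPitch, srcX, and srcY are illustrative only and not part of any provider's API, and you may need to flip an axis depending on where your source puts its origin.

```r
# Pitch size used throughout this guide
pitchWidth <- 80
pitchLength <- 120

# Hypothetical helper: rescale percentage coordinates (0-100) from a
# left-to-right source onto the vertical pitch used in this guide.
# Assumes srcX runs along the pitch length and srcY across its width.
toVerticalPitch <- function(srcX, srcY,
                            targetWidth = pitchWidth,
                            targetLength = pitchLength) {
  data.frame(
    xData = srcY / 100 * targetWidth,   # across the pitch -> horizontal axis
    yData = srcX / 100 * targetLength   # along the pitch  -> vertical axis
  )
}

# Made-up example events: centre spot, a corner, and a spot 12 yards from the far goal line
events <- toVerticalPitch(srcX = c(50, 100, 90), srcY = c(50, 0, 50))
print(events)
```

The returned data frame uses the same xData and yData column names as the random data in this guide, so it could be dropped into the same plotting code.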
Now let's generate some random data for the plot. I'm using the sample function to draw 100 random values. Here's what each argument to sample is doing: xSample <- sample({range between 0 and the pitchWidth variable}, {size = 100 draws}, {replace = TRUE}). xSample receives 100 whole numbers drawn from the range 0 to the width of the pitch; replace = TRUE puts each value back after it's drawn, so it can be selected again. ySample is the same except we use the length of the pitch for the range. Finally, we combine our values into an R data frame (table) and store it as a variable called randomData.
main.r
# Generate 100 random values for x and y within the bounds of the pitch
xSample <- sample(0:pitchWidth, 100, replace = TRUE)
ySample <- sample(0:pitchLength, 100, replace = TRUE)

# Combine x and y into a dataframe
randomData <- data.frame(xData = xSample, yData = ySample)
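Because sample() draws new values on every run, the plot will look different each time. If you want the same random points to come back on every run (handy when you're comparing style changes), one option is to fix the seed first. The seed value 42 below is an arbitrary choice, and the checks at the end are just a quick way to confirm the data sits inside the pitch.

```r
pitchWidth <- 80
pitchLength <- 120

set.seed(42)  # arbitrary seed so the same "random" points come back on every run

xSample <- sample(0:pitchWidth, 100, replace = TRUE)
ySample <- sample(0:pitchLength, 100, replace = TRUE)
randomData <- data.frame(xData = xSample, yData = ySample)

# Quick sanity checks: row count and the min/max of each coordinate
print(nrow(randomData))
print(range(randomData$xData))
print(range(randomData$yData))
```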
Now the only thing left to do for the first chart is to create the plot. We're going to call the ggplot function from the ggplot2 package (loaded with the tidyverse) and pass our random data to it.
main.r
ggplot(randomData) +
  geom_point(aes(xData,yData),size=3, alpha=0.5, color=pointColor) 
Now that we have our data mapped, it's time to add our layer. If this is your first time working with this package, I would highly suggest learning more about ggplot2. The basic idea is that ggplot2 is a system for declaratively creating graphics. After you call the function ggplot(data), you add layers that map that data onto the plot using the + symbol. In this case we're making a simple scatter plot, so we need to add the data points to the plot: + geom_point(). Inside the geom_point function, we need to provide a couple of aesthetic mappings, the values for x and y. We want x to equal the xData variable and y to equal the yData variable from our data frame, so we'll set them inside our geom_point layer: geom_point(aes(xData, yData)). That's everything needed to create the initial plot, but we'll add a couple of properties so it's easier to understand what we're doing in the next steps. With geom_point(aes(xData, yData), size = 3, alpha = 0.5, color = pointColor) we're now setting the size to 3, the alpha to 50%, and the color to our static variable.

Creating a Plot with Pitch Lines

Now let's jump straight into plotting the same data in a way that provides better context. First, let's set some static variables to style our chart. For this I'm going to add the showtext library and a Google font for better customization of the title. I've also set some additional color and size variables that can easily be adjusted for each case.
mainPitchLines.r
library(tidyverse)
library(showtext)

#Chart style variables
font_add_google("Source Sans 3")  #add, load and set font from google font
showtext_auto()   
fontFamily <- "Source Sans 3" 

fontColor <- "#F3F4F6"            #set font color
backgroundColor <-"#111827"       #set plot background color
insideLines <- "#9CA3AF"          #set inside line color
outsideLines <- "#F3F4F6"         #set outside line color
lineSize <- 0.6                   #set width of lines
pointColor <- "#059669"           #set point color
Next up we're creating our pitch dimensions, adding a chart title, and sampling some random data to plot.
mainPitchLines.r
#Bring in data and manipulate it here

#Create a chart title and set pitch dimensions (Should be done dynamically based on your data)
chartTitle <- "Plot of Random Events"
pitchWidth <- 80
pitchLength <- 120

# Generate 100 random values for x and y within the bounds of the pitch
xSample <- sample(0:pitchWidth, 100, replace = TRUE)
ySample <- sample(0:pitchLength, 100, replace = TRUE)

# Combine x and y into a dataframe
randomData <- data.frame(xData = xSample, yData = ySample)
Then we're going to call the ggplot function and pass in our data again. Instead of immediately calling geom_point, we're going to draw the pitch first. Because of the way ggplot2 works, you have to add the bottom layers first and work up. If we called geom_point first, our scatter plot would appear underneath the pitch lines. This also means we'll be working inside out, because I want the outside of the pitch to be on top: it's set up to have a higher contrast to the background, and we won't be overlaying much data on it. There are 15 layers in total to draw, but don't worry, I'll break each one down so you can understand the logic and how to manipulate annotations for other things.
mainPitchLines.r
ggplot(randomData) +

Annotating Pitch Lines

The first thing we'll do is draw our halfway line. Since the logic is simple, I'll take a little more time to explain how annotate() works. Let's start by having a look at the documentation for annotate in R. A simple trick is to add a question mark before any function name and run it in the console: ?annotate. This will bring up the usage, arguments, and examples for that function.
Usage

annotate(
  geom,
  x = NULL,
  y = NULL,
  xmin = NULL,
  xmax = NULL,
  ymin = NULL,
  ymax = NULL,
  xend = NULL,
  yend = NULL,
  ...,
  na.rm = FALSE
)
You'll notice that annotate takes a geom argument first, then some location information, followed by other arguments that you can pass to the function by name. Since we won't need every argument, I'm going to explicitly name the ones we need to set. First I'm going to pass in "segment" for the geom since we want to make a line. x and y represent the start location for the segment, and xend and yend its end; I prefer to simplify things and deal with each axis independently when possible. Since we're drawing the line horizontally across the pitch, the start of the line will be x = 0 and the end will be the width of the pitch, so xend = pitchWidth. Since we're looking for the midpoint, y = (pitchLength/2) and yend = (pitchLength/2). Next I'll set color and size to their respective variables, insideLines and lineSize. (On ggplot2 3.4.0 and later, line widths are meant to be set with linewidth; size still works for lines but raises a deprecation warning.) I'll then close the annotate function and chain the next function with +. If you want to test as you go, simply remove the plus symbol. At this stage you should have a plot with nothing on it but a horizontal line through the middle.
mainPitchLines.r
  #Annotate halfway Line
  annotate("segment",
           x = 0, xend = pitchWidth,
           y = (pitchLength/2), yend = (pitchLength/2),
           color = insideLines, size = lineSize
           ) +
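If you'd like to test this layer in isolation, the sketch below is one way to do it without the rest of the script. It makes a few substitutions purely for illustration: it starts from an empty ggplot() call instead of ggplot(randomData), uses a literal color in place of the insideLines variable, leaves the line width at its default, and stores the result in a made-up variable called halfwayCheck.

```r
library(ggplot2)

pitchWidth <- 80
pitchLength <- 120

# Halfway-line layer from the guide on an empty plot, with no trailing +
# (default line width, literal color instead of the style variable)
halfwayCheck <- ggplot() +
  annotate("segment",
           x = 0, xend = pitchWidth,
           y = pitchLength / 2, yend = pitchLength / 2,
           color = "#9CA3AF")

halfwayCheck  # printing the object draws the plot
```

The same pattern works for any of the layers that follow: build one layer, print it, then add the next.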
Next up we're going to create our center dot by passing "point" as the geom. We want it at the midpoint of both the width and length of the pitch, so we set x = (pitchWidth/2), y = (pitchLength/2). We set the color to the insideLines variable and the size to three times the size of our lines, size = (lineSize*3), and our second layer is done. Close it up and chain the next annotate function with the + operator.
mainPitchLines.r
  #Annotate center dot
  annotate("point",
           x = (pitchWidth/2), y = (pitchLength/2),
           color = insideLines, size = (lineSize*3)
           ) +
The next annotation we'll draw requires a bit of math. We're going to dynamically create the center circle, and yes, there's an easier way to do this, but this is the best way to understand the "path" geom and will be necessary to create the penalty arcs later on. Step 1 is to find the center point of the circle we're going to create. In this case it's in the dead center of the pitch, so x = (pitchWidth/2) and y = (pitchLength/2). Next we'll need to make a circular path. For this we'll be using the seq function in R. Remember you can learn more about the function with ?seq, but let's decide what approach to use first, so we know what we need from seq. For this path, we're going to use sine and cosine to calculate the sides of right triangles inside our circle in order to dynamically calculate the x and y coordinates for each point. Since we know our hypotenuse will be the radius of the circle, the only other variable we need is the angle θ, measured in radians. Have a look at the videos to get a visual understanding of how we're going to build our circle and any arc-shaped paths in the future.

[Videos: sin(θ), cos(θ)]

Now that we know we need to sequence from 0 to 2π, we can specify that in our seq function: seq(from = 0, to = 2*pi, length.out = 9), where length.out is the number of points in the sequence. Nine points give us an octagon (the first and last points coincide), but we'll increase the number to 2000 for a smooth arc in the final code. Next we run the cosine function over our sequenced theta values to find the length of the adjacent side, so we wrap the entire seq with cos() and multiply by the radius we want for our circle, which is 10 yards in this case: 10 * cos(seq(0, 2*pi, length.out = 9)). Since x is the adjacent side, we add that to our horizontal starting point: x = (pitchWidth/2) + 10*cos(seq(0, 2*pi, length.out = 9)). For y, we do the same with sin and center it vertically: y = (pitchLength/2) + 10*sin(seq(0, 2*pi, length.out = 9)), and we've got an octagon in the center of our pitch. Just increase length.out in both functions to 2000, set the color and size variables, and we'll have a pixel-perfect center circle.
mainPitchLines.r
  #Annotate center circle
  annotate("path",
           x=(pitchWidth/2)+10*cos(seq(0,2*pi,length.out=2000)),
           y=(pitchLength/2)+10*sin(seq(0,2*pi,length.out=2000)),
           color = insideLines, size = lineSize
           ) +
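To see the numbers behind the circle, here is a small base R sketch that wraps the same seq, cos, and sin logic in a helper. The function name arcPath and its arguments are made up for this illustration; it simply returns the x and y coordinates rather than drawing anything.

```r
# Hypothetical helper: x/y coordinates along a circular arc.
# cx, cy: centre; r: radius; from, to: angles in radians; n: number of points
arcPath <- function(cx, cy, r, from = 0, to = 2 * pi, n = 2000) {
  theta <- seq(from, to, length.out = n)
  data.frame(x = cx + r * cos(theta), y = cy + r * sin(theta))
}

pitchWidth <- 80
pitchLength <- 120

# Nine points from 0 to 2*pi: the first and last coincide, leaving 8 distinct corners
octagon <- arcPath(pitchWidth / 2, pitchLength / 2, r = 10, n = 9)
print(round(octagon, 2))

# Every point sits one radius (10 yards) from the centre of the pitch
distances <- sqrt((octagon$x - pitchWidth / 2)^2 + (octagon$y - pitchLength / 2)^2)
print(distances)
```

The columns of the returned data frame could be handed to annotate("path", x = ..., y = ...) in place of the inline formulas, which keeps the later arc layers shorter.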
Next we'll add a couple of rectangles to act as the goal areas. We'll pass "rect" as the geom and start by finding the center point with pitchWidth/2; then we'll subtract 10 from it to give us the xmin and add 10 to give us the xmax. This gives us a box that is in the middle of the field and 20 yards wide. For the goal area at the bottom of the pitch, setting ymin = 0 and ymax = 6 creates our 6-yard box. For the goal area at the top of the pitch, we want ymax = pitchLength and then subtract 6 from it to create the ymin. Then just set the color and size to your variables. Lastly, we need to specify what we want done with the fill. Since we don't want any, we'll set fill = NA and our goal areas are complete.
mainPitchLines.r
  #Annotate goal areas
  annotate("rect",
           xmin = (pitchWidth/2)-10, xmax = (pitchWidth/2)+10,
           ymin = 0, ymax = 6,
           fill = NA, color = insideLines, size = lineSize
           ) +
  annotate("rect",
           xmin = (pitchWidth/2)-10, xmax = (pitchWidth/2)+10,
           ymin = (pitchLength-6), ymax = pitchLength,
           fill = NA, color = insideLines, size = lineSize) +
At this point we've covered every geom type we'll need, and I would recommend trying to build the rest out on your own, but I'll still go through them below so you can see how I approached them.
For the penalty areas we'll repeat the same process as before. Since the box is 44 x 18, you'll want xmin = (pitchWidth/2)-22 and xmax = (pitchWidth/2)+22. Next you'll set 0 and 18 for the ymin/ymax of the first box, and the second one will be ymin = (pitchLength-18) and ymax = pitchLength. Set fill = NA and set your style variables.
mainPitchLines.r
  #Annotate penalty area
  annotate("rect",
           xmin = (pitchWidth/2)-22, xmax = (pitchWidth/2)+22,
           ymin = 0, ymax = 18,
           fill = NA, color = insideLines, size = lineSize) +
  annotate("rect",
           xmin = (pitchWidth/2)-22, xmax = (pitchWidth/2)+22,
           ymin = (pitchLength-18), ymax = pitchLength,
           fill = NA, color = insideLines, size = lineSize
           ) +
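The goal areas and penalty areas follow the same arithmetic: centre the box on pitchWidth/2, then measure its depth in from each goal line. The base R sketch below captures that pattern in a helper so you can check the bounds before drawing them. The function name boxBounds and its arguments are hypothetical and only for illustration.

```r
# Hypothetical helper: bounds of a centred box at both ends of the pitch.
# boxWidth and boxDepth are in the same units as the pitch (yards here).
boxBounds <- function(boxWidth, boxDepth, pitchWidth = 80, pitchLength = 120) {
  data.frame(
    end  = c("bottom", "top"),
    xmin = pitchWidth / 2 - boxWidth / 2,
    xmax = pitchWidth / 2 + boxWidth / 2,
    ymin = c(0, pitchLength - boxDepth),
    ymax = c(boxDepth, pitchLength)
  )
}

goalAreas <- boxBounds(boxWidth = 20, boxDepth = 6)
penaltyAreas <- boxBounds(boxWidth = 44, boxDepth = 18)

print(goalAreas)
print(penaltyAreas)
```

Each row holds the four values that the matching annotate("rect", ...) call in this guide uses, so changing pitchWidth or pitchLength moves every box with it.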
For our penalty spots, we're putting them in the center of the field with x = (pitchWidth/2) and setting the y values 12 yards out with y = 12 and y = (pitchLength-12). We'll set the size to three times our lineSize and the color to match the rest of our lines so far.
mainPitchLines.r
  #Annotate penalty spots
  annotate("point",
           x = (pitchWidth/2), y = 12,
           color = insideLines, size = (lineSize*3)
           ) +
  annotate("point",
           x = (pitchWidth/2), y = (pitchLength-12),
           color = insideLines, size = (lineSize*3)
           ) +
Next up we'll use another two path geoms to build the penalty arcs. Penalty arcs are just 10-yard circles around the penalty spot, but the arc is only painted outside the penalty area. This means there are only a couple of minor tweaks to make to our center circle formulas. We can use the inverse of sine to find the exact start and end angles for our arc, since we know the hypotenuse is 10 yards and the opposite side is 6 yards (the penalty area edge at 18 yards minus the penalty spot at 12). asin(6/10) is about 0.205π, so the arc at the bottom of the pitch runs from roughly 0.2π to 0.8π and, you guessed it, the one at the top runs from roughly 1.2π to 1.8π since we're calculating the opposite side of the circle. We then just wrap our sequences in cos() and sin() and add them to their starting points. I've nudged the start and end angles in by 0.005π (to 0.205π and 0.795π, and 1.205π and 1.795π) to account for any overlap caused by line size. Set the size and color to your variables and it's time to add the outside lines to our field.
mainPitchLines.r
  #Annotate Penalty Arcs
  annotate("path",
           x=(pitchWidth/2)+10*cos(seq(0.205*pi,0.795*pi,length.out=600)),
           y=12+10*sin(seq(0.205*pi,0.795*pi,length.out=600)),
           size = lineSize, color=insideLines
           ) +
  annotate("path",
           x=(pitchWidth/2)+10*cos(seq(1.205*pi,1.795*pi,length.out=600)),
           y=(pitchLength-12)+10*sin(seq(1.205*pi,1.795*pi,length.out=600)),
           size = lineSize, color=insideLines
           ) +
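The start and end angles for the arcs can be checked with a few lines of base R. This is only a worked version of the arithmetic described above, using made-up variable names (opposite, radius, startAngle, endAngle).

```r
# Distance from the penalty spot to the edge of the penalty area: 18 - 12 = 6 yards
opposite <- 18 - 12
radius <- 10

startAngle <- asin(opposite / radius)  # angle where the arc meets the box edge
endAngle <- pi - startAngle            # mirror image on the other side

# Expressed as multiples of pi, to compare with the values passed to seq()
print(round(c(startAngle, endAngle) / pi, 3))  # 0.205 0.795

# Height of the arc's first point for the bottom penalty spot (y = 12)
print(12 + radius * sin(startAngle))           # 18, the edge of the penalty area
```

Adding π to both angles gives the matching pair for the arc at the top of the pitch.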
Next we'll add another rectangle for our touchlines and by-lines, starting at 0 and going to the width and length of the pitch. Make sure fill = NA. This is where I'll introduce a higher-contrast color with color = outsideLines.
mainPitchLines.r
  #Annotate Pitch Border
  annotate("rect",
           xmin=0, xmax = pitchWidth,
           ymin = 0, ymax = pitchLength,
           fill = NA, color = outsideLines,
           size = lineSize
           )+
Next up, I'll create some rectangles to represent the goals. Goals are 8 yards wide, so we'll use (pitchWidth/2)±4 for our xmin/xmax, and we'll extend them half a yard off the field, from -0.5 to 0 and from pitchLength to pitchLength+0.5. For these rectangles I'll set fill and color equal to our outsideLines variable and set the size.
mainPitchLines.r
  #Annotate Goals
  annotate("rect",
           xmin = (pitchWidth/2)-4 , xmax = (pitchWidth/2)+4,
           ymin = -0.5, ymax = 0,
           fill = outsideLines, color = outsideLines, size = lineSize
           ) +
  annotate("rect",
           xmin = (pitchWidth/2)-4 , xmax = (pitchWidth/2)+4,
           ymin =pitchLength, ymax = (pitchLength+0.5),
           fill = outsideLines, color = outsideLines, size = lineSize
           ) +
We're now ready to start adding our data on top of the pitch. We'll simply drop our random scatter plot back in for this guide, and then I'll make some adjustments to the theme elements, so the final result ends up clean and production ready.
mainPitchLines.r
  #Create scatter points for data
  geom_point(aes(xData, yData), size=3, alpha=0.5, color=pointColor) +

  #Customize the theme elements that interfere with the pitch  
  theme(
    rect = element_blank(),        #Remove inner background
    line = element_blank(),        #Remove grid lines
    axis.title = element_blank(),  #Remove axis titles
    axis.text = element_blank(),   #Remove axis values
    plot.background = element_rect(
      fill = backgroundColor, color = backgroundColor
    ),                             #Set background equal to style variable
    plot.title = element_text(
      face = "bold",               #set title font face
      size = 32,                   #set title size
      family = fontFamily,         #set title font family
      color = fontColor,           #set title color
      hjust = .5,                  #set horizontal position to center
      vjust = 0                    #set vertical justification
    )
  ) +
  
  labs(title=chartTitle)        #Adds the title based on our variable
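One optional extra, which is my own suggestion rather than part of the code above: by default the plot stretches to fill whatever window it's drawn in, so the pitch can look too wide or too narrow. Adding coord_fixed() to the chain asks ggplot2 to draw one unit of x the same length as one unit of y. The sketch below shows it on a stand-in plot with only the pitch border, using a made-up variable name (pitchPlot) and a literal color to keep it short.

```r
library(ggplot2)

pitchWidth <- 80
pitchLength <- 120

# Minimal stand-in for the full pitch: just the border, to keep the example short
pitchPlot <- ggplot() +
  annotate("rect",
           xmin = 0, xmax = pitchWidth,
           ymin = 0, ymax = pitchLength,
           fill = NA, color = "#111827") +
  coord_fixed()  # one unit on x is drawn the same length as one unit on y

pitchPlot  # printing the object draws the plot
```

In the full script, the same coord_fixed() call could be added anywhere in the chain, for example just before theme().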
That's it, you should now have fully dynamic pitch lines annotated on your plot, ready for you to add your clusters, heatmaps, passing networks, possession chains, or any other location-based event data you need to visualize.
Keep an eye out for more guides on displaying football (soccer) data with ggplot2 and R.

Author

Chris Lawson

Soccer Coach and Analyst

Maintaining a collaborative blog to aid coaches, analysts and ultimately players in how to measure and improve performance. Specializes in the application of technology in high-performance environments including data and video analysis and will contribute articles and guides in these areas.
