If you've been busy at the multiplex watching Marvel Studios' "The Avengers" assemble, you may have missed some staggering data visualizations that brilliantly deconstruct Earth's Mightiest Heroes' entire 570-issue run.
The visualizations were created by Jer Thorp, currently the New York Times' Data Artist-in-residence, who also happens to be a long-time fan of The Avengers. Analyzing data from the Comic Vine API, Thorp produced striking visuals that reveal some interesting things about the super-team's past. For example, did you know the Avengers had the most females on the team from 1983 to '84? And did you know the issue to feature the most characters was 1998's "Pomp and Pageantry"? (119 Avengers appeared!)
Thorp spoke with Comic Book Resources about the work and drive behind turning 50 years of dusty back issues into new data visualizations that reveal the underlying patterns of the Avengers.
CBR News: As a longtime fan of The Avengers, what were a few things you discovered about the team's history during the project that surprised you?
Jer Thorp: In a lot of ways, this was a chance to re-read (albeit in a different way) a set of stories and characters that were really integral to my childhood.
I started with the Avengers early. I was that not-so-cool kid who had "The West Coast Avengers" in his box. I've probably read most of the main series and the entire WCA run twice -- once as a kid, and once when I dragged my long boxes out of storage in university. Both times, my reading of the series was really informed by my life at the time, and this third machine-reading was a chance to revisit all of those dog-eared issues in as close to an objective way as possible.
Character-wise, I was surprised by just how many Avengers there were -- I'd forgotten about most of them -- and for some of them I had to re-read issues to confirm they actually existed. Triathlon? The Whizzer? Really?
I think, though, the biggest thing I got out of this exercise was a better understanding of the system of people behind the art and stories. Of course names like Mark Gruenwald and Jim Shooter were familiar to me, having seen them in the masthead, but now I have a better idea of how they all fit together. Likewise, it was really interesting to learn something about some of the early creators, who I didn't know much or anything about before.
Can you tell us a bit about the process of putting this all together? How long did it all take?
I worked on these visualizations in the evenings over the course of about three or four days. With any of these data-based projects, I usually have an idea of what I'm going to do or look at, and that typically gets railroaded right away. In the beginning of this project I was really focused on images (the Comic Vine API returns images of all characters, issues & locations) but they turned out to be a lot less interesting than I thought -- mainly because the wiki is populated with modern images of most characters, rather than vintage ones. It became clear fairly quickly that looking at patterns in character appearances and creator credits might be fruitful, so that's the angle that I ended up pursuing.
All of the visualizations are hand-built in Processing. I spent the first day or so putting together the wiring to get connected to the Comic Vine API and to get everything pulled down. It also took a little bit of time to extract unique colors for each character. I wanted to automate this process, but in the end I ended up writing a little tool that let me pick colors from images by hand. Once I had all of the API data about every issue and character, along with the color data, it was a matter of making a whole bunch of visualizations and deciding what I thought was the most interesting (over the course of this project I probably made about 300 images, most of which were abject failures).
I'd like to do something with the location data (the API stores locations from every issue), though it's kind of tricky to find valid latitude/longitude values for the Kree homeworld, or the blue area of the moon.
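As a rough illustration of the color step described in that answer, here is a minimal Python sketch, not Thorp's actual Processing tool. It assumes that a handful of pixels have already been picked by hand from a character image and simply averages them into one hex color; the function name `average_color` and the sample values are hypothetical.

```python
def average_color(samples):
    """Average hand-picked (r, g, b) pixel samples (0-255 each) into one hex color.

    Assumption: the samples were chosen manually from a character image;
    this function does no image loading of its own.
    """
    if not samples:
        raise ValueError("need at least one pixel sample")
    n = len(samples)
    # Integer division keeps every channel a whole number in the 0-255 range.
    r = sum(s[0] for s in samples) // n
    g = sum(s[1] for s in samples) // n
    b = sum(s[2] for s in samples) // n
    return "#{:02x}{:02x}{:02x}".format(r, g, b)


# Two made-up samples, standing in for pixels clicked on an image.
picked = [(10, 20, 30), (30, 40, 50)]
print(average_color(picked))
```

Averaging is only one possible choice; a tool like the one described could just as easily keep a single picked pixel per character.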
Which visualization was the most challenging to put together? And was there one you were hoping to do that you couldn't make work?
Most of the visualizations are variations on two themes -- appearances of characters and charts of creator credits. As soon as these foundations were built, it was pretty easy to knock out versions. I look at most of these graphics (or all of them) as sketches. I'm not sure I'm particularly happy with how any of them turned out, but it is a fun experience to dig through the data.
The Comic Vine API has room for concepts to be added to issues, but this is fairly sparsely populated. I'd love to see how narrative themes have changed over the Avengers' run, and to again see if certain editors or writers bring characteristic themes with them during their work on the series.
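The two themes mentioned in that answer, character appearances and creator credits, both reduce to simple tallies once the issue data has been pulled down. The Python sketch below shows one way such counts might be computed. The record layout (`number`, `characters`, `writers`) is a hypothetical stand-in, not the Comic Vine API's actual schema, and the records themselves are placeholders rather than real issue data.

```python
from collections import Counter

# Placeholder records; field names and values are assumptions for illustration.
issues = [
    {"number": 1, "characters": ["Character A", "Character B"], "writers": ["Writer X"]},
    {"number": 2, "characters": ["Character A"], "writers": ["Writer X"]},
    {"number": 3, "characters": ["Character B", "Character C"], "writers": ["Writer Y"]},
]


def appearances_by_character(issues):
    """Count how many issues each character appears in."""
    counts = Counter()
    for issue in issues:
        # set() guards against a character being listed twice in one issue.
        counts.update(set(issue["characters"]))
    return counts


def credits_by_writer(issues):
    """Count how many issues each writer is credited on."""
    counts = Counter()
    for issue in issues:
        counts.update(set(issue["writers"]))
    return counts


def largest_cast(issues):
    """Return the issue with the most distinct characters (first one on ties)."""
    return max(issues, key=lambda issue: len(set(issue["characters"])))


print(appearances_by_character(issues).most_common())
print(credits_by_writer(issues).most_common())
print(largest_cast(issues)["number"])
```

The same `largest_cast` idea is what would surface a record-setting issue such as the one with the most characters, given complete data.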
Does being a comic book fan influence the way you approach data visualization at all? It seems like there might be some overlap there.
Good question. I'd like to think it helps with things like choice of color, building narrative, etc. Certainly, any comic book fan has a built-in visual and compositional sense that probably pays off in some way.
Obviously, with this project, it was my inner comic nerd that gave me the curiosity to dig as deep as I did. It also revived my interest in the series -- I've been going through some back issues over the last couple of days and paying a lot more attention to the underlying structure of things.