Visualizing iBiblio.org traffic
March 5, 2010 3:11 PM
Jeff Heard, from the Renaissance Computing Institute (a joint project between the University of North Carolina, Duke University, and North Carolina State University, among others), posts gorgeous visualizations of internet traffic to projects hosted by iBiblio.org.
From Jeff's post:
"I broke all these addresses up into subnets and assigned a unique position in a sphere to each subnet. Longitude is the first octet, mapped from 0,255 to (-180,180). Latitude is the second octet, mapped from 0,255 to (-90,90), and the radius out from the center is the third octet. The subnets are visualized by a PovRay blob component or a sphere in the case of more than 10% of active internet subnets have hit the site. The radius of the sphere or the strength of the static field behind the blob is the number of IP addresses within the subnet that have touched the site."
What he said.
Pretty stuff, too. (and the amount of data behind it ... I can see why you'd want a graphic representation.)
posted by jlkr at 4:23 PM on March 5, 2010
Why are all the first octets represented but not all the second ones? Wouldn't that have to mean that all the xxx.250.yyy.zzz (for instance) addresses were staying away? What would cause that?
posted by DU at 6:57 PM on March 5, 2010
What he's really plotting is a series of /24 "subnets": AAA.BBB.CCC.0/24, where AAA is mapped onto longitude, BBB onto latitude, and CCC onto distance from center. The trick is that he scales the 0-255 range of each element to the range he needs: over 360 degrees for AAA, over 180 degrees for BBB. CCC, as an arbitrary distance, is simply scaled. The number of hosts in AAA.BBB.CCC.0/24 connecting determines the size of the dot plotted.
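A rough Python sketch of that mapping, in case it helps. This is not Jeff's code: the function names are made up for illustration, the radius is left as the raw third octet (the post doesn't say how it was scaled), and the conversion to x/y/z is just the usual spherical-to-Cartesian formula, assumed here as what a renderer would want.

```python
import math

def scale(octet, lo, hi):
    """Linearly map an octet in 0..255 onto the interval [lo, hi]."""
    return lo + (hi - lo) * octet / 255.0

def subnet_position(ip):
    """Return (longitude, latitude, radius) for the /24 containing a dotted-quad address."""
    a, b, c, _ = (int(part) for part in ip.split("."))
    # Assumption: the radius is the unscaled third octet.
    return scale(a, -180.0, 180.0), scale(b, -90.0, 90.0), float(c)

def to_cartesian(lon, lat, radius):
    """Convert degrees of longitude/latitude plus a radius into x, y, z."""
    lon_r, lat_r = math.radians(lon), math.radians(lat)
    x = radius * math.cos(lat_r) * math.cos(lon_r)
    y = radius * math.cos(lat_r) * math.sin(lon_r)
    z = radius * math.sin(lat_r)
    return x, y, z
```

Every address in the same /24 lands on the same point, which is why the dot size has to carry the host count.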
Jeff is using "subnet" loosely here, not in the internet sense, where subnets are variably sized. So, in that AAA.BBB.CCC.0/24 net he is plotting, you may have a fraction of a net (if it's part of a /20, say) or several nets (a collection of /27s, for example.) But discovering all of those, while not impossible, would be time consuming -- though if you plotted the nets as they were, with, say, color representing the size of the network space and size representing the percentage of that netspace connecting, it could be interesting.
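To make the fixed-/24 bucketing concrete, here is a small sketch using Python's standard ipaddress module. The helper name and the sample addresses (documentation ranges) are made up for illustration, and the /20 and /27 prefixes are only examples of how a real allocation might relate to one bucket.

```python
import ipaddress
from collections import defaultdict

def hosts_per_slash24(addresses):
    """Count distinct client addresses seen in each fixed-size /24 bucket."""
    seen = defaultdict(set)
    for addr in addresses:
        # strict=False masks off the host bits, giving the enclosing /24.
        net = ipaddress.ip_network(f"{addr}/24", strict=False)
        seen[net].add(addr)
    return {net: len(hosts) for net, hosts in seen.items()}

# Hypothetical log sample; the repeated address is only counted once.
counts = hosts_per_slash24(["192.0.2.10", "192.0.2.77", "192.0.2.10", "198.51.100.5"])

bucket = ipaddress.ip_network("192.0.2.0/24")
# The bucket may be only a fraction of a larger allocation (a /20, say)...
print(bucket.subnet_of(ipaddress.ip_network("192.0.0.0/20")))
# ...or it may itself be carved into several smaller nets (eight /27s).
print(len(list(bucket.subnets(new_prefix=27))))
```

Plotting the real allocations instead would mean looking up the actual prefix for every address, which is the time-consuming part.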
But I defer to Jeff in matters of visualization. He's really good at this -- far better than I truly understand. I get what he does, I have no *idea* how he comes up with them. This mapping, for example, is trivial for me to understand -- and I'd never have come up with it.
posted by eriko at 8:35 PM on March 5, 2010
Right, I get that. If he were really plotting by subnet, it would make sense that a latitude range is missing. That subnet might not be connected to the internet or might not even be assigned to any computers. But it seems weird to me that a range of BBBs didn't hit him from any AAAs.
posted by DU at 4:32 AM on March 6, 2010
But it seems weird to me that a range of BBBs didn't hit him from any AAAs.
Actually, that makes perfect sense (and arguably, he should have scaled around it.) A lot of the AAAs will never hit. 10.0.0.0/8 will only have addresses from local machines, and 224.0.0.0/4 and 240.0.0.0/4 should never hit at all, as they're multicast and reserved.
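For anyone who wants to check an address against those ranges, a minimal sketch with the standard ipaddress module follows. It uses the standard multicast block 224.0.0.0/4 and the reserved block 240.0.0.0/4 alongside 10.0.0.0/8; the list is deliberately not exhaustive, and the names are made up for illustration.

```python
import ipaddress

# Assumption: only the three ranges discussed here; real filtering would list more.
NEVER_PUBLIC = [
    ipaddress.ip_network("10.0.0.0/8"),    # private, local machines only
    ipaddress.ip_network("224.0.0.0/4"),   # multicast
    ipaddress.ip_network("240.0.0.0/4"),   # reserved
]

def never_hits(ip):
    """True if the address falls in a range that should not show up as a public client."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NEVER_PUBLIC)
```

Under that list, first octets 10 and 224 through 255 would leave empty longitude bands in the plot.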
posted by eriko at 8:36 AM on March 6, 2010
While these are interesting to view, they fail as information graphics. IP blocks are not distributed continuously against any interesting continuous variable. That is to say, if you perturb the first and second octets by small amounts, you may or may not remain on the same network or anywhere near it in a geographical, administrative, or topological sense. Thus, each blob merely represents a certain amount of traffic and nothing else. There is little information to be gained about the distribution of this traffic from the images, except that "more blobs = more traffic".
In other words, the iBiblio traffic logs are an interesting source of "noise" from which to generate some cool-looking pictures. There's nothing wrong with that.
However, if Mr. Heard had wanted to create images with infographical content, he could have created blobs according to the real latitude and longitude of the client IP: Decent IP-to-global-coordinate databases with fine enough resolution and accuracy for this project are free and easy to find (e.g., MaxMind's GeoLite City). Wouldn't it have been more engaging if the blobs represented global user traffic, positioned by actual geographical coordinates?
posted by libcrypt at 10:10 AM on March 6, 2010 [1 favorite]
A lot of the AAAs will never hit. 10.0.0.0/8 will only have addresses from local machines, and 224.0.0.0/4 and 240.0.0.0/4 should never hit at all, as they're multicast and reserved.
Once again, you have successfully explained a question I was not asking.
posted by DU at 4:32 PM on March 6, 2010
posted by eriko at 3:22 PM on March 5, 2010