I'm not sure what you are trying to capture here. You say you have data that shows gross traffic, but I wonder what that traffic is. A few thoughts on this (dredging up my old ISP days):

Data on the internet flows either within networks (Autonomous Systems - generally switched flow) or between networks, where routing comes into play at peering points, either public or private, or at NAPs (Network Access Points); see BGP generally.

If we look at flow between, say, London and Paris, one could try to estimate data going into LINX (the main London NAP) and out of SFINX (the Paris one). But in Europe there are lots of international / pan-European single-AS networks, so this traffic will not show up in such an analysis unless it has to enter from or exit to a peer. Then there is the fact that traffic that does exit at, say, LINX may in fact just be hopping transatlantic. And lastly, there are many more peering points just in the two cities I have mentioned.

Unless things have moved on since I looked at this stuff, you can't geographically source IP packets across networks (if you have your own network, sure, you can do that). I'm sure you can get data that shows very gross traffic in and out of the major 'nodes' around the world, so you could work out relative traffic levels, but an actual city-to-city flow matrix - I'd love to see that; in fact I'd possibly pay good money to see that.

Sorry if this misses the point and/or I'm a few years behind in the tech.

Ren
www.renreynolds.com

-----Original Message-----
From: air-l-admin@aoir.org [mailto:air-l-admin@aoir.org] On Behalf Of Justin Rosenthal
Sent: 19 May 2004 21:28
To: air-l@aoir.org
Subject: [Air-l] Data

Hello all,

I am interested in hearing any thoughts you have on a data problem that I have, that I am sure many of you have approached, and which is, of course, a result of the structure of the Internet itself.
In my ideal world, I would be able to build a relational database of data traffic between the largest cities worldwide. The data I have found shows gross data traffic between nodes. That gross figure includes, for example, traffic originating in third-party cities and destined for fourth-party cities, but it does not provide a separate estimate of the traffic originating in city 3 and destined for city 4. This means that the data doesn't relate every node in the city system to every other in terms of inbound and outbound network traffic.

Have you approached this problem? Do you have any thoughts on how currently available data can be patched for network analysis, or how such a relational database could be built in the future?

Many thanks,
Justin

_____________________________________
Justin Rosenthal
MA Candidate - Social Science
University of Chicago
jrr@uchicago.edu

_______________________________________________
Air-l mailing list
Air-l@aoir.org
http://www.aoir.org/mailman/listinfo/air-l
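
One rough way to "patch" gross per-city totals into a city-to-city matrix is a simple gravity model, which assumes that traffic from city i to city j is proportional to i's total outbound traffic times j's total inbound traffic. This is a minimal sketch under that assumption, not a method proposed by either poster. The function name `gravity_estimate` and the city totals are hypothetical, and the model ignores the peering and transit effects described in the reply above, so it gives relative levels at best.

```python
def gravity_estimate(out_totals, in_totals):
    """Estimate an origin-destination matrix from per-city gross totals.

    Assumption (simple gravity model): T[i][j] = out_i * in_j / total,
    where total is the sum of all inbound traffic. Real routing
    (single-AS networks, transit hops) is not modelled at all.
    """
    total = sum(in_totals.values())
    if total <= 0:
        raise ValueError("total inbound traffic must be positive")
    return {
        (origin, dest): out_totals[origin] * in_totals[dest] / total
        for origin in out_totals
        for dest in in_totals
    }


# Hypothetical gross totals in arbitrary units (made up for illustration).
outbound = {"London": 60.0, "Paris": 30.0, "Chicago": 10.0}
inbound = {"London": 50.0, "Paris": 30.0, "Chicago": 20.0}

estimate = gravity_estimate(outbound, inbound)
for (origin, dest), value in sorted(estimate.items()):
    print(f"{origin} -> {dest}: {value:.1f}")
```

When total outbound equals total inbound, each origin's estimated flows sum back to its outbound total and each destination's to its inbound total, so the sketch is at least consistent with the gross data. It cannot recover the true matrix, which is the gap the original question points at.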