Building an Artificial America in a Supercomputer: Synthetic Citizens to Help Social Scientists
April 23rd, 2010

I found a one-paragraph version of this information linked from Technofascism Blog and went looking for more. The full text of the 2008 IEEE piece is below.
Knowing what we know about the surveillance data that’s being collected, I see no reason why the “simulation” couldn’t be set up to run on real data, for all Americans. MAIN CORE is already tracking about eight million Americans. Yes, 300 million is a lot more than eight million, but what’s a couple of orders of magnitude when you have an unknown budget and two 1.5-million-square-foot data centers (Utah / Texas)? Besides, the article below states that the Virginia Tech team was planning to scale its version of the simulation to 300 million people.
I know. You’re probably thinking that the machine is going to fill up with blurry pictures of kids’ birthday parties, pikelet recipes, tax returns, trips to Wal-Mart and McDonald’s, and seemingly infinite amounts of nonsense. That’s right, but the technocrats don’t see it that way. The technocrats see a way to maintain and expand their power. In their warped worldview, this system would offer a way to keep an eye on the freaks, and to set off an alarm whenever any of those freaks deviates from the normal routine that some Magic 8 Ball algorithm has quantified for them.
The level of granularity could certainly vary, from one individual to another, to conserve datacenter resources. For example, maybe They won’t bother storing your email unless they see firearms-related purchases. On the other hand, everyone who’s registered to vote for third-party candidates might automatically be added to one of the higher-echelon shit lists. The same goes for people who send encrypted email. Or, say the wrong phrase on the phone and the next time you go to the airport, a three-hundred-pound diabetic, wearing a polyester uniform that smells like fried chicken, will be grabbing your crotch, looking for the bomb that you might have sewn into your underwear. This would all be completely automated.
Now that we’re all considered to be potential terrorists, automation is key.
Via: IEEE:
At a rally in rural North Carolina during the 2008 U.S. presidential election campaign, Alaska governor Sarah Palin infamously said that there was a “real America” and presumably a fake one. Though she was the butt of jokes for the remainder of the campaign, in a way Palin was right. One state over, a team of computer scientists and a physicist from Virginia Polytechnic Institute and State University (Virginia Tech), in Blacksburg, Va., was creating a fake America of its own.
The group has designed what it claims is the largest, most detailed, and realistic computer model of the lives of about 100 million Americans, using enormous amounts of publicly available demographic data. The model’s makers hope the simulation will shed light on the effects of human comings and goings, such as how a contagion spreads, a fad grows, or traffic flows. In the next six months, the researchers expect to be able to simulate the movement of all 300 million residents of the United States.
As many as 163 variables, mostly drawn from the U.S. Census, come into play for each synthetic American. Called EpiSimdemics, the model almost perfectly matches the demographic attributes of groups with at least 1500 people, according to Keith Bisset, a senior research associate who works on the simulation’s software. The software generates fake people to populate real communities and assigns each person characteristics such as age, education level, and occupation to mirror local statistics derived from the most recent national census. In accordance with the data, some individuals are clustered into families, while others live alone.
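The article doesn’t include any of the team’s code, but the general technique of fitting a synthetic population to census marginals can be sketched in a few lines of Python. Everything below (the attribute categories, the probabilities, and the household-building loop) is an illustrative assumption, not the actual EpiSimdemics generator.

```python
import random

# Hypothetical census marginals for one block group. The numbers are made up
# for illustration; the real model draws as many as 163 variables from the
# U.S. Census.
CENSUS_MARGINALS = {
    "age":            {"0-17": 0.24, "18-44": 0.37, "45-64": 0.26, "65+": 0.13},
    "education":      {"none": 0.12, "high_school": 0.48, "college": 0.40},
    "household_size": {1: 0.28, 2: 0.34, 3: 0.18, 4: 0.20},
}

def draw(dist):
    """Sample one category from a {category: probability} distribution."""
    return random.choices(list(dist), weights=dist.values(), k=1)[0]

def synthesize_block_group(population):
    """Generate synthetic residents whose attribute frequencies mirror the
    block group's marginals, then cluster them into households."""
    people = [
        {"age": draw(CENSUS_MARGINALS["age"]),
         "education": draw(CENSUS_MARGINALS["education"])}
        for _ in range(population)
    ]
    households, i = [], 0
    while i < len(people):
        size = draw(CENSUS_MARGINALS["household_size"])
        households.append(people[i:i + size])
        i += size
    return households

print(len(synthesize_block_group(1500)), "synthetic households")
```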
Every synthetic household is assigned a real street address, based on land-use information from Navteq, a digital-mapping company. Using data from a business directory, each employed individual is matched to a specific job within a reasonable commute from the person’s home. Similarly, actual schools, supermarkets, and shopping centers identified through Navteq’s database are also linked to households based on their proximity to the home. When an artificial American goes grocery shopping, the simulation algorithm assigns probabilities that he or she will visit one store over another, adding an element of unpredictability to a person’s daily schedule.
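A minimal sketch of that kind of probabilistic location assignment, assuming a simple inverse-distance weighting (the article doesn’t say what weighting the team actually uses) and made-up coordinates standing in for the Navteq and business-directory data:

```python
import math
import random

# Hypothetical grocery stores near a household; in the real model these come
# from Navteq land-use and business-directory data.
STORES = {
    "store_a": (37.23, -80.41),
    "store_b": (37.25, -80.38),
    "store_c": (37.20, -80.45),
}

def distance(p, q):
    """Rough planar distance between two (lat, lon) points; fine for ranking."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def pick_store(home, stores=STORES):
    """Choose a store with probability inversely related to distance, so
    nearby stores are favored but the choice stays stochastic."""
    weights = {name: 1.0 / (distance(home, loc) + 1e-6) for name, loc in stores.items()}
    return random.choices(list(weights), weights=weights.values(), k=1)[0]

# The same household may be sent to a different store on different days,
# which is where the daily schedule's unpredictability comes from.
print(pick_store((37.22, -80.42)))
```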
Though the simulation is not restricted to the study of contagious diseases, a major application so far has been modeling how a flu epidemic might propagate through different regions. To accomplish this, each person has an embedded model of how he or she might respond to the flu, with probabilities derived from epidemiological data and the person’s age and general health.
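A toy version of such an embedded response model, with the baseline probabilities and the age and health adjustments invented for illustration rather than taken from epidemiological data:

```python
import random

def flu_course(age, healthy=True, rng=random.Random(0)):
    """Return (incubation_days, symptomatic, infectious_days) for one synthetic
    person. All numbers below are assumed placeholders, not calibrated values."""
    p_symptomatic = 0.67                    # assumed baseline chance of symptoms
    if age >= 65 or not healthy:
        p_symptomatic = min(p_symptomatic + 0.15, 1.0)  # assumed penalty for age/poor health
    incubation = rng.randint(1, 3)          # days before becoming contagious
    infectious = rng.randint(3, 6)          # days of contagiousness
    symptomatic = rng.random() < p_symptomatic
    return incubation, symptomatic, infectious

print(flu_course(age=70, healthy=False))
```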
Now imagine that a few of those model citizens become infected with the flu. Discerning the impact of millions of unique behaviors on infectious disease patterns involves performing many millions of tiny but intertwined calculations. As the sample population grows, those calculations quickly become a hefty computing task. “The lack of symmetry and regularity makes these types of problems very different from traditional physics problems that require large computing power,” says Stephen Eubank, the physicist on the project. “We have to address all kinds of scaling issues with the very irregular communication patterns in the model.”
To break up the problem into computable chunks, the software treats each person and each location as a separate set of calculations. In a flu experiment, the algorithm starts with a person in a given health state. If the person is recently infected, his or her health will steadily deteriorate over the course of a day. The victim may begin to show symptoms, and at a certain time the person will become contagious. The algorithm stores a record of each person’s health state as it was at each of the locations he or she visited throughout the day.
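Read as code, each person is essentially a small state machine plus a day’s itinerary. The sketch below assumes a standard SEIR-style state encoding and made-up dwell times; the names and thresholds are illustrative, not the project’s.

```python
from dataclasses import dataclass, field

# Assumed SEIR-style health states for a synthetic person.
SUSCEPTIBLE, EXPOSED, INFECTIOUS, RECOVERED = range(4)

@dataclass
class Person:
    pid: int
    state: int = SUSCEPTIBLE
    days_in_state: int = 0
    itinerary: list = field(default_factory=list)   # location ids visited today

    def advance_day(self):
        """Deteriorate or recover according to simple dwell-time rules
        (the thresholds here are illustrative, not calibrated)."""
        self.days_in_state += 1
        if self.state == EXPOSED and self.days_in_state >= 2:
            self.state, self.days_in_state = INFECTIOUS, 0
        elif self.state == INFECTIOUS and self.days_in_state >= 5:
            self.state, self.days_in_state = RECOVERED, 0

    def visit_records(self, day):
        """One record per location visited, stamped with the day's health state."""
        return [(loc, self.pid, self.state, day) for loc in self.itinerary]
```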
Once they have been compiled, those health records are dispatched to the modules of code representing the locations visited by each person. The algorithm checks all the interactions among people who were at a location at the same time and determines the number of new infections that arose from the day’s encounters. After those calculations are finished, the location module sends the updated infection data back to the modules representing each person. Each person and each location is calculated on a unique processing element so that many parts of the algorithm can be computed in parallel on a supercomputer. “This brute-force computation changes qualitatively how we think about the problem,” says Christopher Barrett, who works on the project and is the director of Virginia Tech’s Network Dynamics and Simulation Science Laboratory.
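The matching location side might look something like the sketch below: one location’s visit records are scanned for co-located infectious and susceptible visitors, and the set of new infections is returned so it can be sent back to the person modules. The per-contact transmission probability is an assumed placeholder, and in the real system each of these person and location computations runs on its own processing element.

```python
import random

# Same assumed state encoding as in the person sketch above.
SUSCEPTIBLE, EXPOSED, INFECTIOUS, RECOVERED = range(4)

def process_location(records, transmission_prob=0.05, rng=random.Random(1)):
    """records: (location_id, person_id, state, day) tuples for one location.
    Returns the ids of susceptible visitors infected by infectious co-visitors.
    transmission_prob is an assumed per-contact value, not a calibrated one."""
    infectious = [pid for (_, pid, state, _) in records if state == INFECTIOUS]
    susceptible = [pid for (_, pid, state, _) in records if state == SUSCEPTIBLE]
    newly_infected = set()
    for pid in susceptible:
        # Each infectious co-visitor is an independent chance of transmission.
        if any(rng.random() < transmission_prob for _ in infectious):
            newly_infected.add(pid)
    return newly_infected
```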
For a recent experiment on flu transmission over three months in the Chicago area, for example, the researchers ran 10 iterations of the simulation in 30 minutes each on a cluster of three dozen machines. By virtue of organizing the problem into people and location entities, they were able to speed up the software substantially; using the algorithms available five years ago, a single simulation on comparable machines would have taken up to six hours.
Each run of the simulation reveals the path that the virus took through the population, which could help identify particularly vulnerable subpopulations and the most effective public health interventions. The simulation can also indicate the number of infections each day over the course of the study period, which is important because the peak in infections indicates the biggest burden on the city’s health system. To simulate flu transmission across the entire country, the computer scientists plan to incorporate human air travel next, using flight data from the International Air Transport Association on the number of flights connecting any two hubs. “The vision is for a Google-like interface, where you approach the system and ask it a question,” says Barrett. “The framework is there, and now we’re pushing the system to larger and larger scales.”
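That per-day output is just a tally over the run’s infection events. A trivial sketch (with made-up event tuples) of how the daily curve and its peak might be pulled out of a single run:

```python
from collections import Counter

def epidemic_curve(infection_events):
    """infection_events: (person_id, day_infected) pairs from one simulation run.
    Returns the daily infection counts and the peak day, which corresponds to
    the heaviest load on the health system."""
    daily = Counter(day for _, day in infection_events)
    peak_day = max(daily, key=daily.get)
    return daily, peak_day

daily, peak = epidemic_curve([(1, 3), (2, 3), (3, 4), (4, 4), (5, 4), (6, 5)])
print(dict(daily), "peak on day", peak)
```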