EFY Times  

 
Viewing The World Through The Eyes Of Wikipedia
 
Thursday, June 21, 2012: SGI (NASDAQ: SGI), the trusted leader in technical computing, has partnered with Kalev H. Leetaru of the University of Illinois to create the first-ever historical mapping and exploration, in time and space, of the full text contents of the English-language edition of Wikipedia. The results include visualizations of modern history produced in under a day using in-memory data-mining techniques. By loading the entire English-language edition of Wikipedia into SGI® UV™ 2000, Mr. Leetaru was able to show how Wikipedia’s view of the world unfolded over the past two centuries. Each mention in the text is tied to a location, a year and its positive or negative sentiment.

While several previous projects have mapped Wikipedia entries using location metadata manually assigned by editors, those attempts accounted for only a tiny fraction of Wikipedia’s location information. This project unlocked the contents of the articles themselves, identifying every location and date across all four million pages, along with the connections among them, to create a massive network.

“Seeing” Wikipedia in a brand new way

“This analysis allows the world to take a step back from the individual articles and text to gain a forest view of the tremendous knowledge captured in Wikipedia, not just a page by page tree view. We can watch how one of the largest collections of human knowledge has evolved and see what we could never see before, such as global sentiment at a certain time and place, or where there might be blind spots in the knowledge coverage,” said Franz Aman, chief marketing officer and head of strategy, SGI. “We love to use Google Earth because we can zoom out and get the big picture view. With SGI UV 2, we can apply the same concept to Big Data to get the big picture on our Big Data.”

From this analysis, Wikipedia is seen to have four periods of growth in its historical coverage: 1001-1500 (Middle Ages), 1501-1729 (Early Modern Period), 1730-2003 (Age of Enlightenment) and 2004-2011 (Wikipedia Era). Its continued growth appears to be focused on enhancing its coverage of historical events rather than on documenting more of the present. The average tone of Wikipedia’s coverage of each year closely matches major global events, with the most negative period in the last 1,000 years being the American Civil War, followed by World War II. The analysis also shows that the “copyright gap” that blanks out most of the twentieth century in digitized print collections is not a problem for Wikipedia, which shows steady exponential growth in its coverage from 1924 to today.
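
As a rough illustration of how an average tone per year could be computed and assigned to one of the four periods, consider the sketch below. It is not the method used in the study. The period boundaries are copied from the paragraph above, while the function names and the idea of feeding in `(year, tone)` pairs are assumptions made for the example.

```python
from collections import defaultdict

# Period boundaries as listed in the article (inclusive year ranges).
PERIODS = [
    (1001, 1500, "Middle Ages"),
    (1501, 1729, "Early Modern Period"),
    (1730, 2003, "Age of Enlightenment"),
    (2004, 2011, "Wikipedia Era"),
]


def period_for(year):
    """Return the article's period name for a year, or None if out of range."""
    for start, end, name in PERIODS:
        if start <= year <= end:
            return name
    return None


def average_tone_by_year(records):
    """Average hypothetical tone scores per year.

    `records` is an iterable of (year, tone) pairs; the tone scale is an
    assumption, since the article does not describe how tone was scored.
    """
    totals = defaultdict(lambda: [0.0, 0])  # year -> [sum of tones, count]
    for year, tone in records:
        totals[year][0] += tone
        totals[year][1] += 1
    return {year: total / count for year, (total, count) in totals.items()}
```

Any tone values passed to `average_tone_by_year` would have to come from a separate sentiment-scoring step, which the article does not specify.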

Enabling researchers to data-mine Big Data at the speed of Big Data

“The one-way nature of connections in Wikipedia, the lack of links, and the uneven distribution of Infoboxes, all point to the limitations of metadata-based data mining of collections like Wikipedia,” said Mr. Leetaru. “With SGI UV 2, the large shared memory available allowed me to ask questions of the entire dataset in near-real time. With a huge amount of cache-coherent shared memory at my fingertips, I could simply write a few lines of code and run it across the entire dataset, asking whatever questions came to mind. This isn’t possible with a scale-out computing approach. It’s very similar to using a word processor instead of using a typewriter – I can conduct my research in a completely different way, focusing on the outcomes, not the algorithms.”

The analytical approach

Loaded into SGI® UV™ 2000, the Big Brain computer, this massive dataset underwent full-text geocoding and complete date-coding, using algorithms that identified every mention of every location and every date across the text of every entry on Wikipedia. More than 80 million locations and 42 million dates between 1000 AD and 2012 were extracted, averaging 19 locations and 11 dates per article (every 44 words and every 75 words, respectively). The connections between every date and every location were captured into a massive network representing Wikipedia’s view of history. With this instrumentation, Mr. Leetaru was able to perform near-real-time analysis over the entire dataset on the SGI UV 2, creating visual maps across space and time that show not only how history unfolded but also the overall tone of the world over the last thousand years, and to interactively test a wide array of theories and research questions, all in less than a day’s work.
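
The extraction and linking steps described above can be pictured with a minimal sketch. This is not Mr. Leetaru's code. The toy gazetteer `GAZETTEER`, the year pattern `YEAR_RE` and all function names are hypothetical, and real full-text geocoding needs place-name disambiguation that is omitted here.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy gazetteer; a real system would use a far larger
# place-name database and handle multi-word and ambiguous names.
GAZETTEER = {"Paris", "Berlin", "Gettysburg"}

# Four-digit years from 1000 to 2012, the range quoted in the article.
YEAR_RE = re.compile(r"\b(1\d{3}|200\d|201[0-2])\b")


def extract_years(text):
    """Return every year mention in the text, in order of appearance."""
    return [int(y) for y in YEAR_RE.findall(text)]


def extract_locations(text):
    """Return every capitalized word that appears in the toy gazetteer."""
    words = re.findall(r"[A-Z][a-z]+", text)
    return [w for w in words if w in GAZETTEER]


def build_network(articles):
    """Count (location, year) pairs that co-occur within the same article."""
    edges = Counter()
    for text in articles:
        locations = set(extract_locations(text))
        years = set(extract_years(text))
        for pair in product(locations, years):
            edges[pair] += 1
    return edges
```

Linking every location to every year in the same article is the simplest possible co-occurrence rule. It is chosen here only to keep the sketch short, and the article does not say how closely a date and a location had to appear in order to be connected.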

The New SGI UV: The Big Brain computer

The SGI UV 2 product family enables users to find answers to the world’s most difficult problems on a system as easy to administer as a workstation. Built with the Intel® Xeon® processor E5 family, running standard Linux, and supporting a wide range of storage options, SGI UV 2 offers a complete, industry-standard solution for no-limit computing.

With as few as 16 cores and as little as 32 gigabytes of memory, SGI UV 2 can start small and seamlessly expand. This next-generation platform doubles the number of cores (up to 4096 cores) and quadruples the amount of coherent main memory (up to 64 terabytes) from the previous generation, available for in-memory computing in a single-image system. SGI UV 2 can scale to eight petabytes of shared memory, and at a peak I/O rate of four terabytes per second (14 PB/hour) it could ingest the entire contents of the U.S. Library of Congress print collection in less than three seconds.
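
The throughput figures in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 PB = 1000 TB), which the article does not state. The 12 TB figure is only the upper bound implied by the "less than three seconds" claim, not a published size for the collection.

```python
# Decimal units are an assumption; the article does not specify them.
TB_PER_PB = 1000
SECONDS_PER_HOUR = 3600

# Peak I/O rate quoted in the article, in terabytes per second.
PEAK_IO_TB_PER_S = 4


def tb_per_s_to_pb_per_hour(rate_tb_per_s):
    """Convert a rate in TB/s to PB/hour."""
    return rate_tb_per_s * SECONDS_PER_HOUR / TB_PER_PB


def max_tb_ingested(rate_tb_per_s, seconds):
    """Upper bound on the data volume moved at a given rate and duration."""
    return rate_tb_per_s * seconds


pb_per_hour = tb_per_s_to_pb_per_hour(PEAK_IO_TB_PER_S)  # 14.4, quoted as 14
implied_limit_tb = max_tb_ingested(PEAK_IO_TB_PER_S, 3)  # 12
```

Under these assumptions the quoted 14 PB/hour is 14.4 PB/hour rounded down, and the three-second claim holds only if the print collection amounts to less than about 12 TB of data.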

SGI UV 2000 is available immediately. SGI UV 20 can be ordered today and will start shipping in August 2012. Pricing starts at $30,000 USD.


