EFY Times  
Wednesday, April 16, 2014
Viewing The World Through The Eyes Of Wikipedia
Thursday, June 21, 2012: SGI (NASDAQ: SGI), the trusted leader in technical computing, has partnered with Kalev H. Leetaru of the University of Illinois to create the first-ever historical mapping and exploration of the full text contents of the English-language edition of Wikipedia, in time and space. The results include visualizations of modern history captured in under a day using in-memory data-mining techniques. By loading the entire English-language edition of Wikipedia into the SGI® UV™ 2000, Mr. Leetaru was able to show how Wikipedia’s view of the world unfolded over the past two centuries. Each reference in the text has been tied to a location, a year and a positive or negative sentiment.

Several previous projects have mapped Wikipedia entries using location metadata manually assigned by editors, but those attempts accounted for only a tiny fraction of Wikipedia’s location information. This project unlocked the contents of the articles themselves, identifying every location and date in all four million pages, and the connections among them, to create a massive network.

“Seeing” Wikipedia in a brand new way

“This analysis allows the world to take a step back from the individual articles and text to gain a forest view of the tremendous knowledge captured in Wikipedia, not just a page by page tree view. We can watch how one of the largest collections of human knowledge has evolved and see what we could never see before, such as global sentiment at a certain time and place, or where there might be blind spots in the knowledge coverage,” said Franz Aman, chief marketing officer and head of strategy, SGI. “We love to use Google Earth because we can zoom out and get the big picture view. With SGI UV 2, we can apply the same concept to Big Data to get the big picture on our Big Data.”

From this analysis, Wikipedia is seen to have four periods of growth in its historical coverage: 1001-1500 (Middle Ages), 1501-1729 (Early Modern Period), 1730-2003 (Age of Enlightenment) and 2004-2011 (Wikipedia Era). Its continued growth appears to be focused on enhancing its coverage of historical events rather than on documenting more of the present. The average tone of Wikipedia’s coverage of each year closely matches major global events, with the most negative period in the last 1,000 years being the American Civil War, followed by World War II. The analysis also shows that the “copyright gap” that blanks out most of the twentieth century in digitized print collections is not a problem for Wikipedia, where coverage grows steadily and exponentially from 1924 to today.

Enabling researchers to data-mine Big Data at the speed of Big Data

“The one-way nature of connections in Wikipedia, the lack of links, and the uneven distribution of Infoboxes, all point to the limitations of metadata-based data mining of collections like Wikipedia,” said Mr. Leetaru. “With SGI UV 2, the large shared memory available allowed me to ask questions of the entire dataset in near-real time. With a huge amount of cache-coherent shared memory at my fingertips, I could simply write a few lines of code and run it across the entire dataset, asking whatever questions came to mind. This isn’t possible with a scale-out computing approach. It’s very similar to using a word processor instead of using a typewriter – I can conduct my research in a completely different way, focusing on the outcomes, not the algorithms.”

The analytical approach

Loaded into the SGI® UV™ 2000, the Big Brain computer, this massive dataset underwent full text geocoding and complete date-coding, using algorithms that identified every mention of every location and every date across the text of every entry on Wikipedia. More than 80 million locations and 42 million dates between 1000 AD and 2012 were extracted, averaging 19 locations and 11 dates per article (one every 44 words and one every 75 words, respectively). The connections between every date and every location were captured in a massive network representing Wikipedia’s view of history. With this instrumentation, Mr. Leetaru was able to perform near-real-time analysis over the entire dataset on the SGI UV 2, creating visual maps across space and time that show not only how history unfolded but also the overall tone of the world throughout the last thousand years. He was also able to interactively test a wide array of theories and research questions, all in less than a day’s work.
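As a rough illustration of the idea, the date-coding, geocoding and network-building steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the algorithms used in the project: the tiny `GAZETTEER`, the regular expression for years, and the choice to link a place and a year when they appear in the same sentence are all hypothetical stand-ins.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical mini-gazetteer; a real geocoder would rely on a large
# place-name database and disambiguation rules.
GAZETTEER = {"Paris", "London", "Gettysburg"}

# Four-digit years from 1000 through 2012, the range named in the article.
YEAR_RE = re.compile(r"\b(1\d{3}|200\d|201[0-2])\b")


def extract_years(text):
    """Return every year mention in the text as an integer."""
    return [int(match) for match in YEAR_RE.findall(text)]


def extract_locations(text):
    """Return every capitalized word that appears in the gazetteer."""
    words = re.findall(r"[A-Z][a-z]+", text)
    return [word for word in words if word in GAZETTEER]


def build_network(articles):
    """Count (place, year) pairs that co-occur in the same sentence."""
    edges = Counter()
    for text in articles:
        # Naive sentence split on terminal punctuation (an assumption).
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            places = extract_locations(sentence)
            years = extract_years(sentence)
            edges.update(product(places, years))
    return edges


articles = [
    "The Battle of Gettysburg was fought in 1863. "
    "Paris hosted a fair in 1889 and London in 1851."
]
network = build_network(articles)
```

Here `network` maps each (place, year) pair to the number of sentences' worth of co-occurrences, which is one simple way to represent a network linking dates and locations. Because the whole structure lives in memory, follow-up questions become short queries over `network` rather than new batch jobs.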

The New SGI UV: The Big Brain computer

The SGI UV 2 product family enables users to find answers to the world’s most difficult problems on a system as easy to administer as a workstation. Built with the Intel® Xeon® processor E5 family, running standard Linux, and supporting a wide range of storage options, SGI UV 2 offers a complete, industry-standard solution for no-limit computing.

With as few as 16 cores and 32 gigabytes of memory, SGI UV 2 can start small and seamlessly expand. This next-generation platform doubles the number of cores (up to 4,096) and quadruples the amount of coherent main memory (up to 64 terabytes) compared with the previous generation, all available for in-memory computing in a single-image system. SGI UV 2 can scale to eight petabytes of shared memory, and at a peak I/O rate of four terabytes per second (14 PB/hour) it could ingest the entire contents of the U.S. Library of Congress print collection in less than three seconds.
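The throughput figures above can be checked with simple arithmetic. The short Python sketch below assumes decimal units (1 PB = 1,000 TB); the function names are illustrative only. It converts the quoted peak rate to petabytes per hour and computes the largest dataset that could be ingested in three seconds at that rate.

```python
TB_PER_PB = 1000          # decimal units assumed
SECONDS_PER_HOUR = 3600


def pb_per_hour(tb_per_second):
    """Convert a rate in terabytes per second to petabytes per hour."""
    return tb_per_second * SECONDS_PER_HOUR / TB_PER_PB


def max_ingest_tb(tb_per_second, seconds):
    """Largest volume, in terabytes, ingestible in the given time."""
    return tb_per_second * seconds


hourly_rate = pb_per_hour(4)         # 14.4 PB/hour, quoted as 14 PB/hour
three_second_cap = max_ingest_tb(4, 3)  # 12 TB
```

At four terabytes per second the hourly figure works out to 14.4 PB, consistent with the rounded 14 PB/hour in the text, and the three-second claim implies a print collection of no more than about 12 TB.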

SGI UV 2000 is available immediately. SGI UV 20 can be ordered today and will start shipping in August 2012. Pricing starts at $30,000 USD.
© Copyright 2014 EFY Enterprises Pvt. Ltd.
All rights reserved. Reproduction in whole or in part in any form or medium without written permission is prohibited.
Usage of the content from the web site is subject to Terms and Conditions