EFY Times  

Viewing The World Through The Eyes Of Wikipedia
 
Thursday, June 21, 2012: SGI (NASDAQ: SGI), the trusted leader in technical computing, has partnered with Kalev H. Leetaru of the University of Illinois to create the first-ever historical mapping and exploration of the full text of the English-language edition of Wikipedia in time and space. The resulting visualizations of modern history were produced in under a day using in-memory data-mining techniques. By loading the entire English-language edition of Wikipedia into the SGI® UV™ 2000, Mr. Leetaru was able to show how Wikipedia’s view of the world unfolded over the past two centuries. Each reference is tied to a location, a year, and a positive or negative sentiment.






Several previous projects have mapped Wikipedia entries using location metadata manually assigned by editors, but those attempts accounted for only a tiny fraction of Wikipedia’s location information. This project unlocked the contents of the articles themselves, identifying every location and date across all four million pages, along with the connections among them, to create a massive network.

“Seeing” Wikipedia in a brand new way

“This analysis allows the world to take a step back from the individual articles and text to gain a forest view of the tremendous knowledge captured in Wikipedia, not just a page by page tree view. We can watch how one of the largest collections of human knowledge has evolved and see what we could never see before, such as global sentiment at a certain time and place, or where there might be blind spots in the knowledge coverage,” said Franz Aman, chief marketing officer and head of strategy, SGI. “We love to use Google Earth because we can zoom out and get the big picture view. With SGI UV 2, we can apply the same concept to Big Data to get the big picture on our Big Data.”

From this analysis, Wikipedia is seen to have four periods of growth in its historical coverage: 1001-1500 (Middle Ages), 1501-1729 (Early Modern Period), 1730-2003 (Age of Enlightenment) and 2004-2011 (Wikipedia Era). Its continued growth appears to be focused on enhancing its coverage of historical events rather than on documenting more of the present. The average tone of Wikipedia’s coverage of each year closely tracks major global events: the most negative period in the last 1,000 years is the American Civil War, followed by World War II. The analysis also shows that the “copyright gap” that blanks out most of the twentieth century in digitized print collections is not a problem for Wikipedia, whose coverage grows steadily and exponentially from 1924 to today.
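
The release does not say how tone was scored, so the following is only a minimal sketch of how an average tone per year could be computed. It assumes a tiny word-list sentiment score; the lexicon, the function names `tone` and `average_tone_by_year`, and the sample mentions are all hypothetical and are not taken from Mr. Leetaru's work.

```python
from collections import defaultdict

# Hypothetical miniature tone lexicon; a real study would use a far larger one.
POSITIVE = {"peace", "victory", "prosperity"}
NEGATIVE = {"war", "famine", "massacre"}


def tone(text):
    """Score text as (positive words - negative words) / total words."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def average_tone_by_year(mentions):
    """Average the tone of every text snippet that mentions a given year.

    mentions: iterable of (year, surrounding_text) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for year, text in mentions:
        totals[year] += tone(text)
        counts[year] += 1
    return {year: totals[year] / counts[year] for year in totals}


# Made-up sample mentions, for illustration only.
mentions = [
    (1863, "war and famine"),
    (1863, "war"),
    (1919, "peace"),
]
yearly_tone = average_tone_by_year(mentions)
for year in sorted(yearly_tone):
    print(year, round(yearly_tone[year], 3))
```

Under this toy scoring, a year whose mentions are dominated by conflict vocabulary averages below zero, which is the kind of signal the per-year tone comparison relies on.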

Enabling researchers to data-mine Big Data at the speed of Big Data

“The one-way nature of connections in Wikipedia, the lack of links, and the uneven distribution of Infoboxes, all point to the limitations of metadata-based data mining of collections like Wikipedia,” said Mr. Leetaru. “With SGI UV 2, the large shared memory available allowed me to ask questions of the entire dataset in near-real time. With a huge amount of cache-coherent shared memory at my fingertips, I could simply write a few lines of code and run it across the entire dataset, asking whatever questions came to mind. This isn’t possible with a scale-out computing approach. It’s very similar to using a word processor instead of using a typewriter – I can conduct my research in a completely different way, focusing on the outcomes, not the algorithms.”

The analytical approach

Loaded into the SGI® UV™ 2000, the Big Brain computer, this massive dataset underwent full-text geocoding and complete date-coding, using algorithms that identified every mention of every location and every date across the text of every entry on Wikipedia. More than 80 million locations and 42 million dates between 1000 AD and 2012 were extracted, averaging 19 locations and 11 dates per article (one every 44 words and one every 75 words, respectively). The connections between every date and every location were captured in a massive network representing Wikipedia’s view of history. With this instrumentation, Mr. Leetaru was able to perform near-real-time analysis over the entire dataset on the SGI UV 2, creating visual maps across space and time that show not only how history unfolded but also the overall tone of the world throughout the last thousand years, and interactively testing a wide array of theories and research questions, all in less than a day’s work.
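
The extraction and linking steps described above can be illustrated with a small in-memory sketch. It is not the project's actual algorithm: it assumes a simple regular expression for years between 1000 and 2012 and a lookup against a toy gazetteer, and the names `GAZETTEER`, `extract`, and `build_network`, as well as the sample sentences, are hypothetical.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy gazetteer; a real one would hold millions of place names.
GAZETTEER = {"Paris", "London", "Gettysburg"}

# Four-digit years from 1000 through 2012, the range given in the article.
YEAR_RE = re.compile(r"\b(1\d{3}|200\d|201[0-2])\b")
# Capitalized words are treated as candidate place names.
WORD_RE = re.compile(r"[A-Z][a-z]+")


def extract(text):
    """Return (locations, years) mentioned in one article's text."""
    locations = [w for w in WORD_RE.findall(text) if w in GAZETTEER]
    years = [int(y) for y in YEAR_RE.findall(text)]
    return locations, years


def build_network(articles):
    """Count how often each (location, year) pair co-occurs in an article."""
    edges = Counter()
    for text in articles:
        locations, years = extract(text)
        edges.update(product(set(locations), set(years)))
    return edges


# Made-up sample texts, for illustration only.
articles = [
    "The battle at Gettysburg in 1863 was reported in London.",
    "Paris hosted the exposition of 1889; London had done so in 1851.",
]
network = build_network(articles)
for (place, year), count in sorted(network.items()):
    print(place, year, count)
```

Because the whole edge table lives in one dictionary, any follow-up question (for example, which years are linked to a given place) is a single pass over memory, which is the style of ad hoc querying the article attributes to a large shared-memory system.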

The New SGI UV: The Big Brain computer

The SGI UV 2 product family enables users to find answers to the world’s most difficult problems on a system as easy to administer as a workstation. Built with the Intel® Xeon® processor E5 family, running standard Linux, and supporting a wide range of storage options, SGI UV 2 offers a complete, industry-standard solution for no-limit computing.

With as few as 16 cores and 32 gigabytes of memory, SGI UV 2 can start small and expand seamlessly. This next-generation platform doubles the number of cores (up to 4,096) and quadruples the amount of coherent main memory (up to 64 terabytes) compared with the previous generation, all available for in-memory computing in a single-image system. SGI UV 2 can scale to eight petabytes of shared memory, and at its peak I/O rate of four terabytes per second (14 PB/hour) it could ingest the entire contents of the U.S. Library of Congress print collection in less than three seconds.
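
The figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 PB = 1,000 TB) and uses only numbers stated in the article; the variable names are illustrative.

```python
TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1,000 TB

peak_io_tb_per_s = 4   # peak I/O rate stated in the article
max_cores = 4096       # stated as double the previous generation
max_memory_tb = 64     # stated as four times the previous generation

# Hourly throughput at the peak rate: 4 TB/s * 3,600 s = 14,400 TB = 14.4 PB.
pb_per_hour = peak_io_tb_per_s * 3600 / TB_PER_PB

# Data that can be ingested in three seconds at the peak rate.
tb_in_three_seconds = peak_io_tb_per_s * 3

# Previous-generation limits implied by "doubles" and "quadruples".
previous_cores = max_cores // 2
previous_memory_tb = max_memory_tb // 4

print(f"{pb_per_hour:.1f} PB/hour")
print(f"{tb_in_three_seconds} TB in three seconds")
print(f"previous generation: {previous_cores} cores, {previous_memory_tb} TB")
```

The hourly figure works out to 14.4 PB, which the article rounds to 14 PB/hour. The three-second claim therefore implies that the print collection is being counted as under 12 TB of data.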

SGI UV 2000 is available immediately. SGI UV 20 can be ordered today and will start shipping in August 2012. Pricing starts at $30,000 USD.



© Copyright 2014 EFY Enterprises Pvt. Ltd.
All rights reserved. Reproduction in whole or in part in any form or medium without written permission is prohibited.
Usage of the content from the web site is subject to Terms and Conditions