To say that storing every webpage on the internet, along with books, movies, audio and software, and then making it all available to everyone is a challenging task is an understatement. However, that’s what the San Francisco-based non-profit Internet Archive has been doing since it was founded. And that, for the record, was in 1996 by Brewster Kahle, who sold 2 successful companies: WAIS Inc to AOL in 1995 and Alexa Internet (yup, that Alexa) to Amazon in 1999.
Head over to www.archive.org and you can access this library. When it was first founded it only stored a few webpages. Today it has 1,876,584 movies, 2,310,628 audio recordings and 7,481,674 texts from various books. It also has one of the greatest collections of classic software on the planet. As impressive as they are, all of them pale in comparison to the famous Wayback Machine.
In case you’re lost: the Wayback Machine is the initiative by the Internet Archive to save webpages and ultimately archive the ENTIRE internet from 1996. In other words, the Wayback Machine is an (awesome) internet time machine.
At the time of writing, the Wayback Machine has 452 BILLION webpages saved. Want to see Microsoft.com back in its original form in 1996? No problem! Want to know what Google looked like in August 2003? Here’s your answer!
Why is it doing this? Because its mission, they say, is to build the greatest library on Earth.
To use the Wayback Machine, simply head over to www.archive.org/web. Then enter the website you want to see in the search bar and press enter.
You should then be greeted with a calendar like the one shown below. Click on one of the dates highlighted within a blue circle to view a snapshot of what the website looked like on that particular day. To go further back in time, click on a year in the menu on top which has a black bar.
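If you would rather skip the calendar, a snapshot can usually be reached directly by address. The sketch below builds such an address in Python. The `https://web.archive.org/web/<timestamp>/<site>` pattern is an assumption that is not taken from this article, and `snapshot_url` is a hypothetical helper name; the Wayback Machine is generally understood to redirect a request like this to the capture closest to the requested time.

```python
from datetime import datetime

# Assumed address pattern for snapshots (not stated in the article).
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(site: str, when: datetime) -> str:
    """Build an address asking for the capture of `site` closest to `when`."""
    # Snapshot timestamps are written as YYYYMMDDhhmmss.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{site}"


# Google in August 2003, as mentioned earlier in the article.
print(snapshot_url("google.com", datetime(2003, 8, 1)))
```

Pasting the printed address into a browser should land on a capture from around that date, if one exists.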
When it comes to books, videos, audio recordings and software, the Internet Archive digitizes them and adds them to the library manually. When it comes to collecting web pages for the Wayback Machine, things are different. While the option for anyone to upload webpages exists, most of the work is done with web crawlers.
Web crawlers are automated bots that visit web pages. A crawler visits a link and saves the resulting web page and the content on it. Once that is done, it repeats the process for every other link on that page. Once the website has been saved, the crawlers will revisit it anywhere from a few weeks to a few months later and grab an updated version. While this is a simple process, it can still take anywhere between 6 and 14 months after a crawler’s visit before a website appears on the Wayback Machine.
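The visit, save, follow-the-links loop can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Internet Archive’s actual crawler: `TOY_WEB` is a made-up in-memory stand-in for the internet, and `crawl` is a hypothetical function name.

```python
from collections import deque

# Hypothetical in-memory "web": each address maps to (page content, outgoing links).
TOY_WEB = {
    "a.example/": ("home", ["a.example/about", "a.example/blog"]),
    "a.example/about": ("about us", ["a.example/"]),
    "a.example/blog": ("posts", ["a.example/about", "a.example/missing"]),
}


def crawl(start, fetch):
    """Visit `start`, save the page, then repeat for every link found on it."""
    archive = {}            # address -> saved content
    queue = deque([start])  # pages still waiting to be visited
    seen = {start}          # avoids visiting the same page twice
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:    # dead link: nothing to save
            continue
        content, links = page
        archive[url] = content
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return archive


snapshots = crawl("a.example/", TOY_WEB.get)
print(sorted(snapshots))
```

A real crawler would swap `TOY_WEB.get` for a function that downloads the page and extracts its links, and would schedule the revisits described above.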
There are requirements, though. A crawler will only archive a website if it is listed on the Alexa Rankings, is not password protected, and its owners have not used the robots exclusion standard to block crawlers. Even if a website meets these requirements, certain content on it may not be archived. This can be for various reasons, such as files exceeding the 10MB limit or publishers simply restricting access. This is why any website archived on the Wayback Machine is considered to be a snapshot.
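As a rough sketch of how checks like these might look in code, the Python below uses the standard library’s `urllib.robotparser` to read a robots exclusion file and combines it with the password and file-size rules. The function names, the `toy-crawler` user agent and the reading of 10MB as 10 × 1024 × 1024 bytes are all assumptions made for illustration, and the Alexa Rankings requirement is left out.

```python
from urllib.robotparser import RobotFileParser

# The article's 10MB limit, read here as 10 MiB (an assumption).
MAX_FILE_BYTES = 10 * 1024 * 1024


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots exclusion rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def may_archive(robots_txt, user_agent, url, size_bytes, password_protected=False):
    """Hypothetical eligibility check combining the rules described above."""
    if password_protected:
        return False
    if size_bytes > MAX_FILE_BYTES:
        return False
    return allowed_by_robots(robots_txt, user_agent, url)


# A site owner blocking every crawler from the /private/ section.
ROBOTS = "User-agent: *\nDisallow: /private/\n"

print(may_archive(ROBOTS, "toy-crawler", "https://example.com/index.html", 2048))
print(may_archive(ROBOTS, "toy-crawler", "https://example.com/private/report.html", 2048))
```

The first call is allowed and the second is refused, which is how one site can end up only partly archived.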
So how much space does the Wayback Machine need? 9.6 petabytes, as of December 1st 2014. However, as the internet keeps growing at its rapid pace, so too does the archive of the Wayback Machine. Currently it’s growing at approximately 20TB each WEEK. That’s like downloading TWENTY THOUSAND 1080p movies every week!
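For the curious, here is the back-of-the-envelope arithmetic behind those figures in Python. It assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and, for the last line, a growth rate that stays constant, which is a simplification rather than anything the Internet Archive has stated.

```python
WEEKLY_GROWTH_TB = 20      # from the article
MOVIES_PER_WEEK = 20_000   # from the article
ARCHIVE_SIZE_PB = 9.6      # from the article, as of December 1st 2014
TB_PER_PB = 1000           # decimal units (assumption)
GB_PER_TB = 1000

# Implied size of one "1080p movie" in the comparison.
gb_per_movie = WEEKLY_GROWTH_TB * GB_PER_TB / MOVIES_PER_WEEK

# Growth over a 52-week year.
yearly_growth_pb = WEEKLY_GROWTH_TB * 52 / TB_PER_PB

# Years needed to add another 9.6 PB if the rate never changed.
years_to_double = ARCHIVE_SIZE_PB / yearly_growth_pb

print(gb_per_movie, yearly_growth_pb, round(years_to_double, 1))
```

In other words, the comparison works out to about 1GB per movie, and the archive gains a little over 1 petabyte a year at that pace.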
All this data is stored on specially designed 1-petabyte servers called PetaBoxes (pictured above) across 4 data centres. One data centre is located in San Francisco, inside the Internet Archive headquarters itself. Two more are located in Redwood City and Richmond. The fourth is at the modern-day Library of Alexandria, which acts as a backup to ensure that humanity never loses the Internet Archive’s library the way it lost the original Library of Alexandria.
It’s probably safe to say that archiving the internet doesn’t come cheap. Even if it’s a non-profit, the Internet Archive still needs money for everything it does. According to Wikipedia, the Internet Archive has an annual budget of $10 million. So where does it get the money from? Like any good library, there’s a variety of sources:
Despite the Internet Archive having ambitious goals, its business model seems to be very simple.
The average Joe may never use the Wayback Machine, except maybe once or twice to satisfy his curiosity by looking at what his favourite websites were like back in the day. However, the average Joe was never the target market to begin with! The main users of the Wayback Machine, and the Internet Archive in general, are researchers, historians and scholars.
Furthermore, the Wayback Machine is just like any other museum or library preserving our history. Take one look at the modern era and you’ll find that a lot of our culture and records of important events are all stored digitally somewhere on the Internet. However, this doesn’t mean it’ll be there forever, because a webpage lasts for only 77 days on average. The Wayback Machine is the keeper of modern history. History that future generations can learn from so that they don’t repeat our mistakes. Especially the design of, say, the Microsoft website back in the day.
How do organizations that conduct high risk operations do things differently?
Join CAKE LABS and GBG Colombo as we explore the concepts and practices developed by teams that operate in life and death situations.
This session will cover the way leading companies are adopting modes of operation from airlines, the military and other teams that operate in contexts where mistakes can cause dire consequences.
Note: Registration required to attend event.
REGISTRATION FORM: http://goo.gl/forms/
(Wednesday) 6:00 pm - 8:00 pm SLST
The PCI-DSS Implementation Workshop is primarily aimed at enabling participants to understand and implement PCI Standards successfully in their organization. Participants gain a clear conception of the various requirements of the Payment Card Industry Standards and discover the intent behind each of its requirements. The workshop also helps participants learn how to maintain their PCI compliant status effectively and minimize the possibility of card breach.
The workshop was focused on managers overseeing PCI-DSS compliance, external auditors performing PCI-DSS validation, security professionals operating in a PCI-DSS compliant environment and internal auditors desiring to validate interim compliance.
The course was highly participative and followed a tried and tested format which alternates lecture sessions with practical exercises in breakout groups. It included PCI-DSS background and consequences of non-compliance, scope and overview of all the 12 requirements of PCI DSS, case study and detailed discussion on each requirement, relation between PCI DSS and PA DSS and overview of all 14 requirements of PA DSS with their mapping to PCI-DSS. PCI-DSS Implementation workshop ended with an examination and Certification (CPISI).
Click HERE for registrations!
26 (Thursday) 9:30 am - 27 (Friday) 5:00 am SLST
64, Lotus Road, Colombo
iFest 2k16, organised by the ISACA UCSC student group, the only ISACA Student Group in the country, is a conference focused on enhancing awareness in the fields of information security and governance, with emphasis on emerging technologies and trends including the internet of things, cloud computing and big data.
The conference will include several sessions conducted by resource people with expertise in each area as well as interactive sessions with the participants.
Millennium Information Technologies – Platinum partner
Hutch – Gold partner
LetMeKnow – Official Photography Partner
ReadMe – Digital Media Partners
(Sunday) 9:00 am - 1:00 pm SLST
University of Colombo - New Arts Theater (NAT)
Kumaratunga Munidasa Mawatha, Colombo
UCSC ISACA Student Group ucsc.email@example.com
Become a UX Ninja is a workshop organized by 99X Technology to give insights on popular case studies and to enhance User Experience knowledge.
The speaker, Jostein Magnussen, specializes in digital strategy, customer experience and usability.
Participation: Free of charge
** Limited seats available, Register HERE!
(Monday) 5:00 pm - 7:00 pm SLST
Lakshman Kadirgamar Institute
Horton Place, Colombo 00700
This 3-day training session provides you with must-know strategies, tactics and best practices to build a strong foundation for successful performance testing.
Instructor : Tharun Prakash – a qualified software test consultant / trainer with 10+ years of experience in software testing.
Course Outline : JMeter Basics, Simulation of Dynamic User Behaviours, Building Test Plans, Managing Sessions, Load Distribution, Timers, Resource Monitoring, Analysing and Interpreting Load Test Results, Extending JMeter with Bean Shell, External Plug-ins
Standard Course Fee : 18,000 LKR
Early Enrolment Fee : 15,000 LKR (Payments before 20th May)
For registrations visit : Register Now for Apache JMeter!
More Info : Sugandi – 011 2369099 ( Ext 295 )
Email : firstname.lastname@example.org
*A maximum of 02 seats will be allowed per company
6 (Monday) 9:00 am - 8 (Wednesday) 5:00 pm SLST
ICTA, Kirimandala Mawatha, Colombo, Western Province
The ‘Kandy IT/BPM week’, scheduled to be held from 10th to 12th June 2016, is an industry initiative building regional capacity driven by SLASSCOM, and supported by the Export Development Board (EDB), Information and Communication Technology Agency of Sri Lanka (ICTA), Ministry of Education (MOE) and University of Peradeniya.
Over three days, the Kandy IT/BPM week will feature key Government stakeholders, Sri Lanka’s top IT-BPM industry thought leaders and tech entrepreneurs. It will be a significant event for the promotion of the IT/BPM industry in Kandy.
From budding entrepreneurs to keen undergraduates to young school children still choosing a career path, this event will reach out to everyone, and is a part of a series of similar initiatives in other parts of Sri Lanka. Each event, during the three days, will be held at different venues so as to promote different aspects of Kandy and to showcase its diversity as a city.
For more information on any of the events, or if you are interested in being a part of this initiative, please contact Ranuka Kariyawasam at email@example.com.
10 (Friday) - 12 (Sunday)
SLASSCOM Contact No: +94 114 062223-7
JOIN SRI LANKA’S FIRST EVER STARTUP WEEKEND!
Startup Weekend is a global phenomenon: 54 hours of fast and furious prototype development through to exploring potential markets and pitching. It’s an unparalleled opportunity to build lasting relationships with co-founders, mentors and investors. The real value comes from taking an idea from concept through to execution using Lean tactics and working under high pressure with the best startups.
24 (Friday) 6:30 pm - 26 (Sunday) 9:00 pm SLST
Tilko Jaffna City Hotel
70/6, K.K.S. Road, Jaffna-Kankesanturai Rd, Jaffna, Sri Lanka, NP 40000