What is the dark web? How safe is it and how to access it? Your questions answered

Everything you wanted to know about the dark web but were too afraid to ask


The dark web sounds foreboding. Why else would police in Brazil, Germany, and the United States need to raid dark web e-shops like the Wall Street Market and the Silk Road, charging the operators with a long list of crimes ranging from stolen data and drugs to malware? These things do happen on the dark web, but they are only one piece of the jigsaw.

To understand the dark web properly, you need to understand that the internet is a huge and sometimes disorganized place, almost like a vast flea market or bazaar. With billions of sites and addresses, it is amazing that we can search for – and find – anything at all.

There are three basic levels within this complex thing we call the World Wide Web – open, deep, and dark. Each of these has its place – and its drawbacks.

What is the open web?

The open or surface web is what you access daily through search engines like Bing or Google. Before you even turn on the device, search engines have crawled through the web, looking for information, evaluating the sources, and listing your options.

This is like the general reading room in your local library. The books are there, they’re precisely organized by theme and title, and you’re free and able to look everywhere. When you access the normal internet, your device connects to central servers, which then display the website.

If you have time on your hands, you can just wander through the aisles of a library looking at every book. But if you want to find something specific, you can ask a librarian to help you locate it.

Search engines such as Google, Bing, and DuckDuckGo act like virtual librarians, sorting and cataloging material so it can be easily searched. They do this using “crawlers”, sometimes also known as “spiders” or “robots”, which automatically scan websites and the links between them, then record what they find. This makes it easy for the search engine (and you) to find websites.


Most corporate and public sites work hard to make sure that these web crawlers can easily find them. This makes perfect sense as the entire purpose of creating a website is so that people can access your content and/or buy your products. Most sites do this by deliberately placing “meta tags” in their website code to make it easier for crawlers to catalog them properly.
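As a rough illustration of what a crawler does, here is a minimal Python sketch (standard library only; the start URL is just a placeholder). It fetches one page, records its meta tags, and collects the links a crawler would queue up to visit next.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class PageIndexer(HTMLParser):
    """Records the meta tags and outgoing links a crawler would catalog."""
    def __init__(self):
        super().__init__()
        self.meta = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")
        elif tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])

def crawl_once(url):
    """Fetch one page; return its meta tags and the absolute links to visit next."""
    html = urlopen(url, timeout=30).read().decode("utf-8", errors="replace")
    parser = PageIndexer()
    parser.feed(html)
    return parser.meta, [urljoin(url, link) for link in parser.links]

if __name__ == "__main__":
    meta, links = crawl_once("https://example.com")  # placeholder start URL
    print(meta)
    print(links[:10])
```

A real crawler repeats this step over a frontier of millions of pages and stores what it finds in an index, but the loop of "fetch, read tags, follow links" is the core of it.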

Knowing where online materials are – and who is searching for them – makes it possible for search engines like Google to sell advertisements. Advertising accounts for well over 80 percent of the company’s revenue, linking people who are searching with the millions of sites that pay Google to list their content.

Still, this open and cataloged “crawled” web content is estimated to make up less than 1% of the internet.

What is the deep web?

The term ‘deep web’ doesn’t mean anything nefarious - it’s estimated to make up about 99% of the entire web. It refers to the unindexed web databases and other content that search engines can't crawl through and catalog. The deep web is like an archive, containing an unsorted pile of websites and resources that are largely inaccessible to normal users. 

This could include sites not automatically available to the public, such as those which require a password. Examples of this might be e-mail accounts or registration-only forums. 

There are also millions of servers which only store data that can’t be accessed via a public web page. Data brokers such as LocalBlox, for instance, crawl the web and store information about businesses and consumers to sell for marketing purposes.

Deep sites also include company intranets and governmental websites, for instance the website of the European Union. You may be able to search such pages, but you do so using their own internal search function, not a search engine like Bing or Yahoo. This means the content of such sites isn’t accessible to web crawlers.

The deep web also includes most academic content handled directly by universities. Think of this like searching for a library book using the facility’s own index files – you might have to be in the library to search there.

What is the dark web?

The dark web, despite massive media attention, is an extremely small part of the deep web.

The term is very general, as there are actually a number of ‘darknets’ available, such as ‘Freenet’ and ‘I2P’, but the Tor network has become the most popular. So, when most people refer to the dark net, they mean Tor.

The name is short for The Onion Router, a reference to how Tor works: it sends encrypted traffic through layers of relays around the globe, hiding the content, the sender, and their location. Users need a special browser with added software to access the Tor dark web in the first place.

Not only is browsing via Tor more secure, it is also more private, as it effectively shuts out online trackers. The Tor browser is based on Firefox and makes use of extensions like ‘NoScript’ to prevent harmful code from loading, and there’s a built-in ad blocker (see below).

While it is not flawless in protecting user privacy, it works well enough to give users far more privacy over where they go and the content they access, and to protect their identity and location. The multiple relays help keep some distance and anonymity between the person visiting the website, the website itself, and any entity trying to eavesdrop on the communication between the two.

Tor is both a type of connection – with the extended relays – and a browser. With your device running a Tor browser, you can go to Tor-specific sites – those with a .onion suffix – or visit the usual sites on the open web. The connection between Tor's dark net and the regular internet is bridged via an ‘exit node’. Any internet traffic leaving the exit node is no longer part of Tor's dark web. For maximum security, users should only access sites with the .onion suffix via the browser.
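To make the connection side of this concrete: applications reach the Tor network by pointing their traffic at the local Tor client's SOCKS proxy. The sketch below is a minimal Python example, assuming a Tor client is already running on the default port 9050 and that the requests library was installed with SOCKS support (pip install requests[socks]); it is an illustration of the plumbing, not a browsing recommendation.

```python
import requests

# Assumes a local Tor client with its SOCKS proxy on the default port 9050.
# "socks5h" (rather than "socks5") makes DNS resolution happen inside Tor too,
# which is what allows .onion hostnames to resolve at all.
TOR_PROXIES = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

def fetch_via_tor(url):
    """Fetch a page with all traffic, including DNS lookups, routed through Tor."""
    response = requests.get(url, proxies=TOR_PROXIES, timeout=60)
    response.raise_for_status()
    return response.text

if __name__ == "__main__":
    # A regular open-web site works too; .onion addresses only resolve inside Tor.
    print(fetch_via_tor("https://check.torproject.org")[:200])
```

The Tor Browser wires this up for you; the point of the sketch is simply that "using Tor" means handing your traffic to that local proxy, which then passes it through the relay circuit.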

Admittedly, there are a number of Tor-only sites for illicit drugs or materials. If used properly, the Tor browser allows surfers to stay anonymous and go to “members only” forums where they can use untraceable cryptocurrencies for their purchases. 

But, that’s not the whole story. There are also popular free legal websites which can be accessed via a .onion address. Facebook offers an onion link to access their services, although you may find logging in difficult, as you’ll most likely appear to be signing in from a different location each time.

Mail providers Mailbox.org and Protonmail can also be accessed via a .onion link. This may be welcome news to those in states where security services have attempted to block ‘anonymous’ email websites like these from the open web. Since Tor can be used to access websites governments try to block, the dark web can be a useful tool for people living under dictatorships to access western media.

By its nature, Tor is censorship-resistant. Even if such sites were blocked from the regular open web, anyone using the Tor Browser could still access their email using the .onion addresses. 


Dark web: Privacy in a nutshell 

Alexander Vukcevic, head of the Avira Protection Labs, explains: "With the open, deep, and dark web, there is a difference in who can track you. With a usual open web search, the search engine knows where you are, the number of your device, your IP address, and the theme of the search.

“On the deep web, you can assume that activities are monitored at the gateway. The major difference from the open web is that it is the system admin – not the search engine – that can follow your activities.

"For the dark web, while some activities can be monitored, you are able to hide your personal data before entering. While you might want to search anonymously, some sites – NYTimes and even those illegal markets – can insist you register so you can be identified. Some open web sites will block you from entering with the Tor browser.”

How many dark web sites are there? 

No one knows precisely how many dark web sites are out there. Tor is designed to be resistant to web crawling, but the number of active sites probably only numbers in the thousands.

Finding these can prove a challenge, as searching on the dark web can be an irritation – visually and operationally. Before finding a treasure trove of odd substances or private information, you’re likely to hit a number of dead ends.

Unlike the open web, these sites aren’t really worried about being found by web crawlers or ranking well with on-page SEO tools. While there are Google-like equivalents trying to categorize the dark web, results are spotty. There are some dark web search engines, such as Torch and Haystak; the latter is said to have indexed more .onion sites than any other search engine, but claims like these are hard to prove.

Part of the reason for this is the lack of incentive for content creators on the dark web. Those on Tor aren’t worried about polishing their website with the latest SEO tools to boost its relative ranking on the Google and Bing charts.

Since your connection is routed through multiple Tor relays, page loading times can be very slow, making effective searching extremely time-consuming.

The dark net is tiny when compared to both the open and the deep web, estimated to total around 50,000 sites.

Should I visit the Dark Web?

For most of us, the short answer is that there's no reason to: unless you're really concerned about your privacy or you're doing something that genuinely needs anonymity – such as reporting on repressive regimes or crime syndicates, or trying to bypass state censorship – there's no real reason to venture onto the Dark Web at all, not least because it slows down your browsing.

There's a fascinating thread on Reddit (not remotely safe for work) where dark web users share their stories. Some of the tales are enough to make you tape over your webcam and disable your router just in case. Think of it as the dodgy bit of town where sensible people don't go after dark. It’s the wild west.

While in theory you can buy legitimate products and services on the dark web, remember that anonymity works both ways. If you pay for something and it never arrives, you may well not be able to track down the seller to get your money back. This makes the dark web a popular place for scammers. 

Tor

What is Tor?

Tor stands for The Onion Router, and in 2013 UK MP Julian Smith described it as "the black internet where child pornography, drug trafficking and arms trading take place". He's not wrong:

Tor is where the now-defunct Silk Road drugs marketplace could be found, it's where Black Market Reloaded traded drugs and weapons, and it's where the US National Security Agency says "very naughty people" hang out. It's not the only network on the Dark Web - for example, you may have heard of the Freenet anti-censorship network - but it's by far the most popular.

According to an investigation by Deep Web watchers Vocativ, European terrorists who wanted guns used to "tap into a 20-year-old market that took root and flourished at the end of the Balkan wars. Now with the rise of the dark net, that market has been digitized and deals on illegal guns are only a few minutes away." Many of those deals are from people in the US: Vocativ found 281 listings of guns and ammunition on the dark web, the majority of which were shipping from America.

It's not that Tor is evil; it's just that the same tools that protect political dissidents are pretty good at protecting criminals too.

That wasn't intentional. Tor was initially developed by the US Navy. Its goal was to allow ships to communicate with each other and their bases without revealing their location. It does this by bouncing users' and sites' traffic through multiple relays to disguise where they are.

It's also used by political activists and dissidents, journalists, people who don't trust websites' use of their personal data, and the odd member of the tin foil hat brigade, convinced the government is spying on them at all times.

Whilst using Tor isn’t illegal, the encrypted data packets it uses make it fairly easy to detect. Given its relationship with crime, some ISPs and companies automatically block Tor traffic.

If the dark web’s secret, how does anyone find anything?

For many people, the answer is by using regular websites such as Reddit. Dedicated subreddits guide newcomers around the Dark Web. The moderators enforce a strict policy against posting links to illegal products or services, so you’re more likely to find safer dark web addresses here.

On the open web, there are certain Wikis which are like a kind of Yahoo! for destinations on the Tor network - albeit a Yahoo! where many of the links are likely to land you in prison, which is why we aren't naming or linking to them.

When viewing dark web links, you’ll see that the sites have the .onion extension: that means you need the Tor browser to open them. You'll also see that the majority of sites you can find are marketplaces, because those sites want to attract as many customers as possible. That means they're the tip of the Dark Web iceberg, as many sites are secret and only available to people with the right credentials and/or contacts.

Can I protect my privacy without going onto the dark web? 

You don't need the dark web to protect your identity online.  While Tor is a powerful tool for defending your privacy, it isn't the only one. 

Tor doesn’t protect the data on your device itself, for example. But you can do this using open-source encryption software such as VeraCrypt. Using open-source software means there’s far less chance of any security flaws or deliberate backdoors, as the code is constantly reviewed by the community.
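VeraCrypt itself is a standalone desktop application, but the underlying idea – encrypting files with a key only you hold – can be sketched in a few lines. The example below uses Python's widely used open-source cryptography library (an assumed dependency, installed with pip install cryptography), and the file name is a stand-in; it is a minimal illustration, not a substitute for full-disk encryption.

```python
from cryptography.fernet import Fernet

# A throwaway file standing in for whatever you want to protect.
with open("notes.txt", "wb") as f:
    f.write(b"my private notes")

# Generate a key once and store it somewhere safe (e.g. a password manager);
# without the key, the encrypted file is unreadable.
key = Fernet.generate_key()
cipher = Fernet(key)

with open("notes.txt", "rb") as f:
    encrypted = cipher.encrypt(f.read())
with open("notes.txt.enc", "wb") as f:
    f.write(encrypted)

# Decryption requires the same key.
assert cipher.decrypt(encrypted) == b"my private notes"
```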

There are also privacy-focused and anonymous browsers, which are designed to keep you safe on the regular ‘open’ web. For example, the Epic browser is programmed to always run in private mode, so it doesn't store data about which sites you visit. It is based on Chromium, the open-source project behind Google Chrome, but the developers claim to have removed all Google tracking software and say the browser stops other companies from tracing you too.

If you just want to stop ad networks tracking you, browser plugins such as Ghostery can block trackers. You should also consider installing an ad blocker, which will prevent most harmful or marketing URLs from loading in the first place.

While ad blockers can prevent most harmful links from loading, you should also take steps to protect yourself from malware to keep your data safe from hackers and scammers. Consider installing antivirus software . 

As most malware is designed for Windows, another way to stay safe is to switch to a different operating system. Most versions of Linux, such as Ubuntu, are free of charge, and the best Linux distros make it easy to get set up and started in this environment, especially if you’re coming from an OS like Windows.

VPNs will anonymise your browsing by encrypting the connection between your device and the VPN provider. This makes it extremely difficult for your ISP, or anyone with access to your internet records, to know which sites you visit or apps you use. You can also find a few free VPN services, but be aware of the risks, particularly if you're still using legacy VPNs in your organization.

But don't forget the basics, either: if you're dealing with documents that could make you the next Edward Snowden, use an "air gap" - that is, a device that isn't connected to anything else at all. Your data can't be remotely intercepted if you aren't connected to any networks.

Your data could be everywhere

You, or data about you, could already be at all three levels of the internet – and this should concern you. 

For the open web, just type your name into Google and see what comes up. Whether it’s a LinkedIn profile, a Facebook account, other social media, or any community involvement, chances are that you already have some online presence.

Your data is almost certainly in the deep internet – and you can only hope that it stays there. This would include medical records on a hospital intranet or even school records. Your data is being stored, and you can only hope that the companies holding it are keeping it to GDPR standards, which require them to keep it safe using methods such as encryption.

The cloud has also fueled the growth of the deep internet. If a company puts its files on an Amazon Web Services server, it has placed you on the deep web. This is not a privacy issue – unless they configure the account incorrectly and leave it open to hackers or researchers.

If that happens, you can only hope that they will inform you in accordance with GDPR procedures and that the data has not been copied and added to a database for sale on the dark web.

You should also consider this if you choose to visit the dark web. The Tor browser can conceal your true location by shunting your traffic through various relays. But it can’t stop you from entering personal information on websites to say where you are. Your connection also may be encrypted but if you do something like send an email from your personal account, then anyone with access to your inbox will know that you were online at that time. 

The dark web can be a dangerous place and may not be for everyone. There are also some excellent ways to protect your privacy from most bad actors. Take some time to decide if this is the right option for you before downloading the Tor browser.

Nate Drake is a tech journalist specializing in cybersecurity and retro tech. He broke out from his cubicle at Apple 6 years ago and now spends his days sipping Earl Grey tea & writing elegant copy.


Understanding the Dark Web

  • First Online: 20 January 2021


  • Dimitrios Kavallieros,
  • Dimitrios Myttas,
  • Emmanouil Kermitsis,
  • Euthimios Lissaris,
  • Georgios Giataganas &
  • Eleni Darra

Part of the book series: Security Informatics and Law Enforcement (SILE)


This chapter presents the main differences between the surface web, Deep Web, and Dark Web, as well as their dependencies. It further discusses the nature of the Dark Web and the structure of the most prominent darknets, namely Tor, I2P, and Freenet, and provides technical information regarding the technologies behind these darknets. In addition, it discusses the effects police actions on the surface web can have on the Dark Web, the “dilemma” of usage that anonymity technologies present, as well as the challenges LEAs face while trying to prevent and fight crime and terrorism on the Dark Web.




About this chapter

Kavallieros, D., Myttas, D., Kermitsis, E., Lissaris, E., Giataganas, G., Darra, E. (2021). Understanding the Dark Web. In: Akhgar, B., Gercke, M., Vrochidis, S., Gibson, H. (eds) Dark Web Investigation. Security Informatics and Law Enforcement. Springer, Cham. https://doi.org/10.1007/978-3-030-55343-2_1


The Journal of Electronic Publishing

White Paper: The Deep Web: Surfacing Hidden Value


This White Paper is a version of the one on the BrightPlanet site. Although it is designed as a marketing tool for a program "for existing Web portals that need to provide targeted, comprehensive information to their site visitors," its insight into the structure of the Web makes it worthwhile reading for all those involved in e-publishing. —J.A.T.

Searching on the Internet today can be compared to dragging a net across the surface of the ocean. While a great deal may be caught in the net, there is still a wealth of information that is deep, and therefore, missed. The reason is simple: Most of the Web's information is buried far down on dynamically generated sites, and standard search engines never find it.

Traditional search engines create their indices by spidering or crawling surface Web pages. To be discovered, the page must be static and linked to other pages. Traditional search engines can not "see" or retrieve content in the deep Web — those pages do not exist until they are created dynamically as the result of a specific search. Because traditional search engine crawlers can not probe beneath the surface, the deep Web has heretofore been hidden.

The deep Web is qualitatively different from the surface Web. Deep Web sources store their content in searchable databases that only produce results dynamically in response to a direct request. But a direct query is a "one at a time" laborious way to search. BrightPlanet's search technology automates the process of making dozens of direct queries simultaneously using multiple-thread technology and thus is the only search technology, so far, that is capable of identifying, retrieving, qualifying, classifying, and organizing both "deep" and "surface" content.

If the most coveted commodity of the Information Age is indeed information, then the value of deep Web content is immeasurable. With this in mind, BrightPlanet has quantified the size and relevancy of the deep Web in a study based on data collected between March 13 and 30, 2000. Our key findings include:

  • Public information on the deep Web is currently 400 to 550 times larger than the commonly defined World Wide Web.
  • The deep Web contains 7,500 terabytes of information compared to nineteen terabytes of information in the surface Web.
  • The deep Web contains nearly 550 billion individual documents compared to the one billion of the surface Web.
  • More than 200,000 deep Web sites presently exist.
  • Sixty of the largest deep-Web sites collectively contain about 750 terabytes of information — sufficient by themselves to exceed the size of the surface Web forty times.
  • On average, deep Web sites receive fifty per cent greater monthly traffic than surface sites and are more highly linked to than surface sites; however, the typical (median) deep Web site is not well known to the Internet-searching public.
  • The deep Web is the largest growing category of new information on the Internet.
  • Deep Web sites tend to be narrower, with deeper content, than conventional surface sites.
  • Total quality content of the deep Web is 1,000 to 2,000 times greater than that of the surface Web.
  • Deep Web content is highly relevant to every information need, market, and domain.
  • More than half of the deep Web content resides in topic-specific databases.
  • A full ninety-five per cent of the deep Web is publicly accessible information — not subject to fees or subscriptions.

To put these findings in perspective, a study at the NEC Research Institute, [1] published in Nature, estimated that the search engines with the largest number of Web pages indexed (such as Google or Northern Light) each index no more than sixteen per cent of the surface Web. Since they are missing the deep Web when they use such search engines, Internet searchers are therefore searching only 0.03% — or one in 3,000 — of the pages available to them today. Clearly, simultaneous searching of multiple surface and deep Web sources is necessary when comprehensive information retrieval is needed.

The Deep Web

Internet content is considerably more diverse and the volume certainly much larger than commonly understood.

First, though sometimes used synonymously, the World Wide Web (HTTP protocol) is but a subset of Internet content. Other Internet protocols besides the Web include FTP (file transfer protocol), e-mail, news, Telnet, and Gopher (most prominent among pre-Web protocols). This paper does not consider further these non-Web protocols. [2]

Second, even within the strict context of the Web, most users are aware only of the content presented to them via search engines such as Excite, Google, AltaVista, or Northern Light, or search directories such as Yahoo!, About.com, or LookSmart. Eighty-five percent of Web users use search engines to find needed information, but nearly as high a percentage cite the inability to find desired information as one of their biggest frustrations. [3] According to a recent survey of search-engine satisfaction by market-researcher NPD, search failure rates have increased steadily since 1997. [4a]

The importance of information gathering on the Web and the central and unquestioned role of search engines — plus the frustrations expressed by users about the adequacy of these engines — make them an obvious focus of investigation.

Until Van Leeuwenhoek first looked at a drop of water under a microscope in the late 1600s, people had no idea there was a whole world of "animalcules" beyond their vision. Deep-sea exploration in the past thirty years has turned up hundreds of strange creatures that challenge old ideas about the origins of life and where it can exist. Discovery comes from looking at the world in new ways and with new tools. The genesis of the BrightPlanet study was to look afresh at the nature of information on the Web and how it is being identified and organized.

How Search Engines Work

Search engines obtain their listings in two ways: Authors may submit their own Web pages, or the search engines "crawl" or "spider" documents by following one hypertext link to another. The latter returns the bulk of the listings. Crawlers work by recording every hypertext link in every page they crawl and index. Like ripples propagating across a pond, search-engine crawlers are able to extend their indices further and further from their starting points.

"Whole new classes of Internet-based companies choose the Web as their preferred medium for commerce and information transfer"

The surface Web contains an estimated 2.5 billion documents, growing at a rate of 7.5 million documents per day. [5a] The largest search engines have done an impressive job in extending their reach, though Web growth itself has exceeded the crawling ability of search engines. [6a] [7a] Today, the three largest search engines in terms of internally reported documents indexed are Google with 1.35 billion documents (500 million available to most searches), [8] Fast with 575 million documents, [9] and Northern Light with 327 million documents. [10]

Legitimate criticism has been leveled against search engines for these indiscriminate crawls, mostly because they provide too many results (search on "Web," for example, with Northern Light, and you will get about 47 million hits). Also, because new documents are found from links within other documents, those documents that are cited are more likely to be indexed than new documents — up to eight times as likely. [5b]

To overcome these limitations, the most recent generation of search engines (notably Google) have replaced the random link-following approach with directed crawling and indexing based on the "popularity" of pages. In this approach, documents more frequently cross-referenced than other documents are given priority both for crawling and in the presentation of results. This approach provides superior results when simple queries are issued, but exacerbates the tendency to overlook documents with few links. [5c]
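As a hedged sketch of that idea: instead of following links in the order they are found, a popularity-directed crawler keeps a priority queue keyed by how often a page has been linked to so far. The Python toy model below illustrates only that prioritization (the link graph is invented for illustration); real engines combine many more signals than in-link counts.

```python
import heapq
from collections import defaultdict

def popularity_directed_order(link_graph, start):
    """Visit pages preferring those with the most in-links seen so far.

    link_graph maps a page to the pages it links to. This is a toy model of
    popularity-directed crawling, not any particular engine's algorithm.
    """
    in_links = defaultdict(int)
    visited, order = set(), []
    frontier = [(0, start)]                     # (negative in-link count, page)
    while frontier:
        _, page = heapq.heappop(frontier)
        if page in visited:
            continue
        visited.add(page)
        order.append(page)
        for target in link_graph.get(page, []):
            in_links[target] += 1
            if target not in visited:
                heapq.heappush(frontier, (-in_links[target], target))
    return order

# Invented example graph: "c" is linked from both "start" and "a",
# so it is crawled ahead of the less-cited "b".
graph = {"start": ["a", "b", "c"], "a": ["c"], "b": [], "c": []}
print(popularity_directed_order(graph, "start"))   # ['start', 'a', 'c', 'b']
```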

And, of course, once a search engine needs to update literally millions of existing Web pages, the freshness of its results suffers. Numerous commentators have noted the increased delay in posting and recording new information on conventional search engines. [11a] Our own empirical tests of search engine currency suggest that listings are frequently three or four months — or more — out of date.

Moreover, return to the premise of how a search engine obtains its listings in the first place, whether adjusted for popularity or not. That is, without a linkage from another Web document, the page will never be discovered. But the main failing of search engines is that they depend on the Web's linkages to identify what is on the Web.

Figure 1 is a graphical representation of the limitations of the typical search engine. The content identified is only what appears on the surface and the harvest is fairly indiscriminate. There is tremendous value that resides deeper than this surface content. The information is there, but it is hiding beneath the surface of the Web.

Searchable Databases: Hidden Value on the Web

How does information appear and get presented on the Web? In the earliest days of the Web, there were relatively few documents and sites. It was a manageable task to post all documents as static pages. Because all pages were persistent and constantly available, they could be crawled easily by conventional search engines. In July 1994, the Lycos search engine went public with a catalog of 54,000 documents. [12] Since then, the compound growth rate in Web documents has been on the order of more than 200% annually! [13a]

Sites that were required to manage tens to hundreds of documents could easily do so by posting fixed HTML pages within a static directory structure. However, beginning about 1996, three phenomena took place. First, database technology was introduced to the Internet through such vendors as Bluestone's Sapphire/Web (Bluestone has since been bought by HP) and later Oracle. Second, the Web became commercialized, initially via directories and search engines, but rapidly evolved to include e-commerce. And, third, Web servers were adapted to allow the "dynamic" serving of Web pages (for example, Microsoft's ASP and the Unix PHP technologies).

This confluence produced a true database orientation for the Web, particularly for larger sites. It is now accepted practice that large data producers such as the U.S. Census Bureau, Securities and Exchange Commission, and Patent and Trademark Office, not to mention whole new classes of Internet-based companies, choose the Web as their preferred medium for commerce and information transfer. What has not been broadly appreciated, however, is that the means by which these entities provide their information is no longer through static pages but through database-driven designs.

It has been said that what cannot be seen cannot be defined, and what is not defined cannot be understood. Such has been the case with the importance of databases to the information content of the Web. And such has been the case with a lack of appreciation for how the older model of crawling static Web pages — today's paradigm for conventional search engines — no longer applies to the information content of the Internet.

In 1994, Dr. Jill Ellsworth first coined the phrase "invisible Web" to refer to information content that was "invisible" to conventional search engines. [14] The potential importance of searchable databases was also reflected in the first search site devoted to them, the AT1 engine that was announced with much fanfare in early 1997. [15] However, PLS, AT1's owner, was acquired by AOL in 1998, and soon thereafter the AT1 service was abandoned.

For this study, we have avoided the term "invisible Web" because it is inaccurate. The only thing "invisible" about searchable databases is that they are not indexable nor able to be queried by conventional search engines. Using BrightPlanet technology, they are totally "visible" to those who need to access them.

Figure 2 represents, in a non-scientific way, the improved results that can be obtained by BrightPlanet technology. By first identifying where the proper searchable databases reside, a directed query can then be placed to each of these sources simultaneously to harvest only the results desired — with pinpoint accuracy.

Additional aspects of this representation will be discussed throughout this study. For the moment, however, the key points are that content in the deep Web is massive — approximately 500 times greater than that visible to conventional search engines — with much higher quality throughout.

BrightPlanet's technology is uniquely suited to tap the deep Web and bring its results to the surface. The simplest way to describe our technology is a "directed-query engine." It has other powerful features in results qualification and classification, but it is this ability to query multiple search sites directly and simultaneously that allows deep Web content to be retrieved.

Study Objectives

To perform the study discussed, we used our technology in an iterative process. Our goal was to:

  • Quantify the size and importance of the deep Web.
  • Characterize the deep Web's content, quality, and relevance to information seekers.
  • Discover automated means for identifying deep Web search sites and directing queries to them.
  • Begin the process of educating the Internet-searching public about this heretofore hidden and valuable information storehouse.

Like any newly discovered phenomenon, the deep Web is just being defined and understood. Daily, as we have continued our investigations, we have been amazed at the massive scale and rich content of the deep Web. This white paper concludes with requests for additional insights and information that will enable us to continue to better understand the deep Web.

What Has Not Been Analyzed or Included in Results

This paper does not investigate non-Web sources of Internet content. This study also purposely ignores private intranet information hidden behind firewalls. Many large companies have internal document stores that exceed terabytes of information. Since access to this information is restricted, its scale can not be defined nor can it be characterized. Also, while on average 44% of the "contents" of a typical Web document reside in HTML and other coded information (for example, XML or Javascript), [16] this study does not evaluate specific information within that code. We do, however, include those codes in our quantification of total content (see next section).

Finally, the estimates for the size of the deep Web include neither specialized search engine sources — which may be partially "hidden" to the major traditional search engines — nor the contents of major search engines themselves. This latter category is significant. Simply accounting for the three largest search engines and average Web document sizes suggests search-engine contents alone may equal 25 terabytes or more [17] or somewhat larger than the known size of the surface Web.

A Common Denominator for Size Comparisons

All deep-Web and surface-Web size figures use both total number of documents (or database records in the case of the deep Web) and total data storage. Data storage is based on "HTML included" Web-document size estimates. [13b] This basis includes all HTML and related code information plus standard text content, exclusive of embedded images and standard HTTP "header" information. Use of this standard convention allows apples-to-apples size comparisons between the surface and deep Web. The HTML-included convention was chosen because:

  • Most standard search engines that report document sizes do so on this same basis.
  • When saving documents or Web pages directly from a browser, the file size byte count uses this convention.
  • BrightPlanet's technology reports document sizes on this same basis.

All document sizes used in the comparisons use actual byte counts (1024 bytes per kilobyte).

"Estimating total record count per site was often not straightforward"

In actuality, data storage from deep-Web documents will therefore be considerably less than the figures reported. [18] Actual records retrieved from a searchable database are forwarded to a dynamic Web page template that can include items such as standard headers and footers, ads, etc. While including this HTML code content overstates the size of searchable databases, standard "static" information on the surface Web is presented in the same manner.

HTML-included Web page comparisons provide the common denominator for comparing deep and surface Web sources.

Use and Role of BrightPlanet Technology

All retrievals, aggregations, and document characterizations in this study used BrightPlanet's technology. The technology uses multiple threads for simultaneous source queries and then document downloads. It completely indexes all documents retrieved (including HTML content). After being downloaded and indexed, the documents are scored for relevance using four different scoring algorithms, prominently vector space modeling (VSM) and standard and modified extended Boolean information retrieval (EBIR). [19]

Automated deep Web search-site identification and qualification also used a modified version of the technology employing proprietary content and HTML evaluation methods.

Surface Web Baseline

The most authoritative studies to date of the size of the surface Web have come from Lawrence and Giles of the NEC Research Institute in Princeton, NJ. Their analyses are based on what they term the "publicly indexable" Web. Their first major study, published in Science magazine in 1998, using analysis from December 1997, estimated the total size of the surface Web as 320 million documents. [4b] An update to their study employing a different methodology was published in Nature magazine in 1999, using analysis from February 1999. [5d] This study documented 800 million documents within the publicly indexable Web, with a mean page size of 18.7 kilobytes exclusive of images and HTTP headers. [20]

In partnership with Inktomi, NEC updated its Web page estimates to one billion documents in early 2000. [21] We have taken this most recent size estimate and updated total document storage for the entire surface Web based on the 1999 Nature study:

These are the baseline figures used for the size of the surface Web in this paper. (A more recent study from Cyveillance [5e] has estimated the total surface Web size to be 2.5 billion documents, growing at a rate of 7.5 million documents per day. This is likely a more accurate number, but the NEC estimates are still used because they were based on data gathered closer to the dates of our own analysis.)

Other key findings from the NEC studies that bear on this paper include:

  • Surface Web coverage by individual, major search engines has dropped from a maximum of 32% in 1998 to 16% in 1999, with Northern Light showing the largest coverage.
  • Metasearching using multiple search engines can improve retrieval coverage by a factor of 3.5 or so, though combined coverage from the major engines dropped to 42% from 1998 to 1999.
  • More popular Web documents, that is, those with many link references from other documents, have up to an eight-fold greater chance of being indexed by a search engine than those with no link references.

Analysis of Largest Deep Web Sites

More than 100 individual deep Web sites were characterized to produce the listing of sixty sites reported in the next section.

Site characterization required three steps:

  • Estimating the total number of records or documents contained on that site.
  • Retrieving a random sample of a minimum of ten results from each site and then computing the HTML-included mean document size in bytes. This figure, times the number of total site records, produces the total site size estimate in bytes (a small sketch of this arithmetic follows this list).
  • Indexing and characterizing the search-page form on the site to determine subject coverage.
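The size step in the second item is simple arithmetic; the sketch below spells it out in Python with invented numbers (the record count and sampled page sizes are placeholders, not figures from the study).

```python
def estimated_site_size_bytes(total_records, sample_doc_sizes_bytes):
    """Mean HTML-included document size from a sample, times the site's record count."""
    mean_doc_size = sum(sample_doc_sizes_bytes) / len(sample_doc_sizes_bytes)
    return total_records * mean_doc_size

# Invented example: a site reporting 250,000 records, with ten sampled
# result pages averaging about 25 KB each (HTML included).
sample = [24_100, 26_300, 25_800, 23_900, 25_200, 24_700, 26_000, 25_400, 24_300, 25_600]
print(f"{estimated_site_size_bytes(250_000, sample) / 1024**3:.2f} GB")
```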

Estimating total record count per site was often not straightforward. A series of tests was applied to each site; they are listed below in descending order of importance and confidence in deriving the total document count:

  • E-mail messages were sent to the webmasters or contacts listed for all sites identified, requesting verification of total record counts and storage sizes (uncompressed basis); about 13% of the sites shown in Table 2 provided direct documentation in response to this request.
  • Total record counts as reported by the site itself. This involved inspecting related pages on the site, including help sections, site FAQs, etc.
  • Documented site sizes presented at conferences, estimated by others, etc. This step involved comprehensive Web searching to identify reference sources.
  • Record counts as provided by the site's own search function. Some site searches provide total record counts for all queries submitted. For others that use the NOT operator and allow its stand-alone use, a query term known not to occur on the site such as "NOT ddfhrwxxct" was issued. This approach returns an absolute total record count. Failing these two options, a broad query was issued that would capture the general site content; this number was then corrected for an empirically determined "coverage factor," generally in the 1.2 to 1.4 range [22] .
  • A site that failed all of these tests could not be measured and was dropped from the results listing.

Analysis of Standard Deep Web Sites

Analysis and characterization of the entire deep Web involved a number of discrete tasks:

  • Qualification as a deep Web site.
  • Estimation of total number of deep Web sites.
  • Size analysis.
  • Content and coverage analysis.
  • Site page views and link references.
  • Growth analysis.
  • Quality analysis.

The methods applied to these tasks are discussed separately below.

Deep Web Site Qualification

An initial pool of 53,220 possible deep Web candidate URLs was identified from existing compilations at seven major sites and three minor ones. [23] After harvesting, this pool resulted in 45,732 actual unique listings after tests for duplicates. Cursory inspection indicated that in some cases the subject page was one link removed from the actual search form. Criteria were developed to predict when this might be the case. The BrightPlanet technology was used to retrieve the complete pages and fully index them for both the initial unique sources and the one-link removed sources. A total of 43,348 resulting URLs were actually retrieved.

We then applied filter criteria to these sites to determine whether they were indeed search sites. This proprietary filter involved inspecting the HTML content of the pages, plus analysis of page text content. This brought the total pool of deep Web candidates down to 17,579 URLs.

Subsequent hand inspection of 700 random sites from this listing identified further filter criteria. Ninety-five of these 700, or 13.6%, did not fully qualify as search sites. This correction has been applied to the entire candidate pool and the results presented.

Some of the criteria developed when hand-testing the 700 sites were then incorporated back into an automated test within the BrightPlanet technology for qualifying search sites with what we believe is 98% accuracy. Additionally, automated means for discovering further search sites has been incorporated into our internal version of the technology based on what we learned.

Estimation of Total Number of Sites

The basic technique for estimating total deep Web sites uses "overlap" analysis, the accepted technique chosen for two of the more prominent surface Web size analyses. [6b] [24] We used overlap analysis based on search engine coverage and the deep Web compilation sites noted above (see results in Table 3 through Table 5).

The technique is illustrated in the diagram below:

Overlap analysis involves pairwise comparisons of the number of listings individually within two sources, n_a and n_b, and the degree of shared listings or overlap, n_0, between them. Assuming random listings for both n_a and n_b, the total size of the population, N, can be estimated. The estimate of the fraction of the total population covered by source A is n_0/n_b; dividing the total size of n_a by this fraction yields the estimate of the total population size, N ≈ n_a / (n_0/n_b) = n_a × n_b / n_0. These pairwise estimates are repeated for all of the individual sources used in the analysis.

To illustrate this technique, assume, for example, we know our total population is 100. Then if two sources, A and B, each contain 50 items, we could predict on average that 25 of those items would be shared by the two sources and 25 items would not be listed by either. According to the formula above, this can be represented as: 100 = 50 / (25/50)

There are two keys to overlap analysis. First, it is important to have a relatively accurate estimate for total listing size for at least one of the two sources in the pairwise comparison. Second, both sources should obtain their listings randomly and independently from one another.

This second premise is in fact violated for our deep Web source analysis. Compilation sites are purposeful in collecting their listings, so their sampling is directed. And, for search engine listings, searchable databases are more frequently linked to because of their information value, which increases their relative prevalence within the engine listings. [5f] Thus, the overlap analysis represents a lower bound on the size of the deep Web, since both of these factors will tend to increase the degree of overlap, n_0, reported between the pairwise sources.
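Written out as a small Python sketch, the pairwise estimate described above is the classic capture-recapture calculation; the numbers in the example come from the worked illustration in the text.

```python
def overlap_population_estimate(n_a, n_b, n_overlap):
    """Estimate total population size N from two sources and their shared listings.

    The fraction of the population covered by source A is approximated by
    n_overlap / n_b, so N ≈ n_a / (n_overlap / n_b) = n_a * n_b / n_overlap.
    Because compilation sites and search engines are not truly random samples,
    this underestimates the deep Web: inflated overlap pushes N downward.
    """
    if n_overlap == 0:
        raise ValueError("No shared listings: the estimate is unbounded.")
    return n_a * n_b / n_overlap

# Worked example from the text: two sources of 50 items each, sharing 25
# listings, imply a total population of 100.
assert overlap_population_estimate(50, 50, 25) == 100.0
```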

Deep Web Size Analysis

In order to analyze the total size of the deep Web, we need an average site size in documents and data storage to use as a multiplier applied to the entire population estimate. Results are shown in Figure 4 and Figure 5.

As discussed for the large site analysis, obtaining this information is not straightforward and involves considerable time evaluating each site. To keep estimation time manageable, we chose a +/- 10% confidence interval at the 95% confidence level, requiring a total of 100 random sites to be fully characterized. [25a]

We randomized our listing of 17,000 search site candidates. We then proceeded to work through this list until 100 sites were fully characterized. We followed a less-intensive process than the one used for the large sites analysis to determine the total record or document count for each site.

Exactly 700 sites were inspected in their randomized order to obtain the 100 fully characterized sites. All sites inspected received characterization as to site type and coverage; this information was used in other parts of the analysis.

"The invisible portion of the Web will continue to grow exponentially before the tools to uncover the hidden Web are ready for general use"

The 100 sites that could have their total record/document count determined were then sampled for average document size (HTML-included basis). Random queries were issued to the searchable database with results reported as HTML pages. A minimum of ten of these were generated, saved to disk, and then averaged to determine the mean site page size. In a few cases, such as bibliographic databases, multiple records were reported on a single HTML page. In these instances, three total query results pages were generated, saved to disk, and then averaged based on the total number of records reported on those three pages.

Content Coverage and Type Analysis

Content coverage was analyzed across all 17,000 search sites in the qualified deep Web pool (results shown in Table 6); the type of deep Web site was determined from the 700 hand-characterized sites (results shown in Figure 6).

Broad content coverage for the entire pool was determined by issuing queries for twenty top-level domains against the entire pool. Because of topic overlaps, total occurrences exceeded the number of sites in the pool; this total was used to adjust all categories back to a 100% basis.
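A minimal sketch of that adjustment (the category counts below are invented): each category's share is its query-hit count divided by the sum of all hits, so overlapping topics still sum to 100 per cent.

```python
def normalize_to_percentages(category_hits):
    """Scale overlapping category counts so the shares sum to 100%."""
    total = sum(category_hits.values())
    return {name: 100 * hits / total for name, hits in category_hits.items()}

# Invented counts: raw hits exceed the number of sites because topics overlap.
print(normalize_to_percentages({"Science": 2400, "Business": 3100, "Health": 1800}))
```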

Hand characterization by search-database type resulted in assigning each site to one of twelve arbitrary categories that captured the diversity of database types. These twelve categories are:

  • Topic Databases — subject-specific aggregations of information, such as SEC corporate filings, medical databases, patent records, etc.
  • Internal site — searchable databases for the internal pages of large sites that are dynamically created, such as the knowledge base on the Microsoft site.
  • Publications — searchable databases for current and archived articles.
  • Shopping/Auction.
  • Classifieds.
  • Portals — broader sites that included more than one of these other categories in searchable databases.
  • Library — searchable internal holdings, mostly for university libraries.
  • Yellow and White Pages — people and business finders.
  • Calculators — while not strictly databases, many do include an internal data component for calculating results. Mortgage calculators, dictionary look-ups, and translators between languages are examples.
  • Jobs — job and resume postings.
  • Message or Chat.
  • General Search — searchable databases most often relevant to Internet search topics and information.

These 700 sites were also characterized as to whether they were public or subject to subscription or fee access.

Site Pageviews and Link References

Netscape's "What's Related" browser option, a service from Alexa, provides site popularity rankings and link reference counts for a given URL. [26a] About 71% of deep Web sites have such rankings. The universal power function (a logarithmic growth rate or logarithmic distribution) allows pageviews per month to be extrapolated from the Alexa popularity rankings. [27] The "What's Related" report also shows external link counts to the given URL.

A random sample of 100 deep Web sites and 100 surface Web sites for which complete "What's Related" reports could be obtained was used for the comparisons.

Growth Analysis

The best method for measuring growth is with time-series analysis. However, since the discovery of the deep Web is so new, a different gauge was necessary.

Whois [28] searches associated with domain-registration services [25b] return records listing the domain owner, as well as the date the domain was first obtained (and other information). Using a random sample of 100 deep Web sites [26b] and another sample of 100 surface Web sites, [29] we issued the domain names to a Whois search and retrieved the date each site was first established. These results were then combined and plotted for the deep vs. surface Web samples.
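
A minimal sketch of this lookup, assuming a system whois client is installed; creation-date field labels vary by registry, so the pattern below is illustrative rather than exhaustive.

```python
import re
import subprocess

DATE_FIELD = re.compile(r"(?:Creation Date|created|Registered on):\s*(\S+)",
                        re.IGNORECASE)

def registration_date(domain):
    """Run the system whois client and return the first creation-date value found."""
    output = subprocess.run(["whois", domain], capture_output=True,
                            text=True, timeout=30).stdout
    match = DATE_FIELD.search(output)
    return match.group(1) if match else None

# e.g. registration_date("completeplanet.com")
```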

Quality Analysis

Quality comparisons between deep and surface Web content were based on five diverse, subject-specific queries issued via the BrightPlanet technology to three search engines (AltaVista, Fast, Northern Light) [30] and to three deep Web sites specific to each topic, drawn from the 600 sites presently configured for our technology. The five subject areas were agriculture, medicine, finance/business, science, and law.

The queries were specifically designed to limit total results returned from any of the six sources to a maximum of 200 to ensure complete retrieval from each source. [31] The specific technology configuration settings are documented in the endnotes. [32]

The "quality" determination was based on an average of our technology's VSM and mEBIR computational linguistic scoring methods. [33] [34] The "quality" threshold was set at our score of 82, empirically determined as roughly accurate from millions of previous scores of surface Web documents.

Deep Web vs. surface Web scores were obtained by using the BrightPlanet technology's selection by source option and then counting total documents and documents above the quality scoring threshold.
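
The VSM and mEBIR scorers themselves are proprietary to BrightPlanet and are not reproduced here. As a generic stand-in, the sketch below uses a TF-IDF vector-space score against the query, rescaled to a 0-100 range, to show the threshold-counting step; the threshold of 82 comes from the text, while the library, function name, and sample query are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def quality_counts(query, documents, threshold=82):
    """Count documents whose vector-space score against the query meets the cutoff."""
    matrix = TfidfVectorizer(stop_words="english").fit_transform([query] + documents)
    scores = cosine_similarity(matrix[0:1], matrix[1:]).flatten() * 100  # rescale to 0-100
    return len(documents), int((scores >= threshold).sum())

# total, quality = quality_counts("crop export prices", retrieved_documents)
```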

Results and Discussion

This study is the first known quantification and characterization of the deep Web. Very little has been written about, or is known of, the deep Web. Estimates of its size and importance have been anecdotal at best and certainly underestimate its scale. For example, Intelliseek's "invisible Web" site says that, "In our best estimates today, the valuable content housed within these databases and searchable sources is far bigger than the 800 million plus pages of the 'Visible Web.'" They also estimate total deep Web sources at about 50,000 or so. [35]

Ken Wiseman, who has written one of the most accessible discussions about the deep Web, intimates that it might be about equal in size to the known Web. He also goes on to say, "I can safely predict that the invisible portion of the Web will continue to grow exponentially before the tools to uncover the hidden Web are ready for general use." [36] A mid-1999 survey by About.com's Web search guide concluded the size of the deep Web was "big and getting bigger." [37] A paper at a recent library science meeting suggested that only "a relatively small fraction of the Web is accessible through search engines." [38]

The deep Web is about 500 times larger than the surface Web, with, on average, about three times higher quality based on our document scoring methods on a per-document basis. On an absolute basis, total deep Web quality exceeds that of the surface Web by thousands of times. Total number of deep Web sites likely exceeds 200,000 today and is growing rapidly. [39] Content on the deep Web has meaning and importance for every information seeker and market. More than 95% of deep Web information is publicly available without restriction. The deep Web also appears to be the fastest growing information component of the Web.

General Deep Web Characteristics

Deep Web content has some significant differences from surface Web content. Deep Web documents (13.7 KB mean size; 19.7 KB median size) are on average 27% smaller than surface Web documents. Though individual deep Web sites have tremendous diversity in their number of records, ranging from tens or hundreds to hundreds of millions (a mean of 5.43 million records per site but with a median of only 4,950 records), these sites are on average much, much larger than surface sites. The rest of this paper will serve to amplify these findings.

The mean deep Web site has a Web-expressed (HTML-included basis) database size of 74.4 MB (median of 169 KB). Actual record counts and size estimates can be derived from one-in-seven deep Web sites.

On average, deep Web sites receive about half again as much monthly traffic as surface sites (123,000 pageviews per month vs. 85,000). The median deep Web site receives somewhat more than two times the traffic of a random surface Web site (843,000 monthly pageviews vs. 365,000). Deep Web sites on average are more highly linked to than surface sites by nearly a factor of two (6,200 links vs. 3,700 links), though the median deep Web site is less so (66 vs. 83 links). This suggests that well-known deep Web sites are highly popular, but that the typical deep Web site is not well known to the Internet search public.

One of the more counter-intuitive results is that 97.4% of deep Web sites are publicly available without restriction; a further 1.6% are mixed (limited results publicly available, with greater results requiring subscription and/or paid fees); only 1.1% of sites are totally subscription or fee limited. This result is counter-intuitive because of the visible prominence of subscriber-limited sites such as Dialog, Lexis-Nexis, Wall Street Journal Interactive, etc. (We obtained the document counts from the sites themselves or from other published sources.)

However, once the broader pool of deep Web sites is looked at beyond the large, visible, fee-based ones, public availability dominates.

60 Deep Sites Already Exceed the Surface Web by 40 Times

Table 2 indicates that the sixty largest known deep Web sites contain data of about 750 terabytes (HTML-included basis), or roughly forty times the size of the known surface Web. These sites appear in a broad array of domains from science to law to images and commerce. We estimate the total number of records or documents within this group to be about eighty-five billion.

Roughly two-thirds of these sites are public ones, representing about 90% of the content available within this group of sixty. The absolutely massive size of the largest sites shown also illustrates the universal power function distribution of sites within the deep Web, not dissimilar to Web site popularity [40] or surface Web sites. [41] One implication of this type of distribution is that there is no real upper size boundary to which sites may grow.

This listing is preliminary and likely incomplete since we lack a complete census of deep Web sites.

Our inspection of the 700 random-sample deep Web sites identified another three that were not in the initially identified pool of 100 potentially large sites. If that ratio were to hold across the entire estimated 200,000 deep Web sites (see next table), perhaps only a very small percentage of sites shown in this table would prove to be the largest. However, since many large sites are anecdotally known, we believe our listing, while highly inaccurate, may represent 10% to 20% of the actual largest deep Web sites in existence.

This inability to identify all of the largest deep Web sites today should not be surprising. The awareness of the deep Web is a new phenomenon and has received little attention. We solicit nominations for additional large sites on our comprehensive CompletePlanet site and will document new instances as they arise.

Deep Web is 500 Times Larger than the Surface Web

We employed three types of overlap analysis to estimate the total number of deep Web sites. In the first approach, shown in Table 3, we issued 100 random deep Web URLs from our pool of 17,000 to the search engines that support URL search and applied the overlap analysis to the results.

This table shows greater diversity in deep Web site estimates as compared to normal surface Web overlap analysis. We believe the reasons for this variability are: 1) the relatively small sample size matched against the engines; 2) the high likelihood of inaccuracy in the baseline for total deep Web database sizes from Northern Light [42] ; and 3) the indiscriminate scaling of Fast and AltaVista deep Web site coverage based on the surface ratios of these engines to Northern Light. As a result, we have little confidence in these results.

An alternate method is to compare the NEC-reported values [5g] for surface Web coverage to the deep Web sites reported by the Northern Light engine. These numbers were further adjusted by the final qualification fraction obtained from our hand scoring of 700 random deep Web sites. These results are shown in Table 4.

This approach, too, suffers from the limitations of using the Northern Light deep Web site baseline. It is also likely, though not certain, that deep Web sites are more highly represented in the search engines' listings, as discussed above.

Our third approach is more relevant and is shown in Table 5.

Under this approach, we use overlap analysis for the three largest compilation sites for deep Web sites used to build our original 17,000 qualified candidate pool. To our knowledge, these are the three largest listings extant, excepting our own CompletePlanet site.

This approach has the advantages of:

  • providing an absolute count of sites
  • ensuring final qualification as to whether the sites are actually deep Web search sites
  • relatively large sample sizes.

Because each of the three compilation sources has a known population, the table shows only three pairwise comparisons (e.g., there is no uncertainty in the ultimate A or B population counts).

As discussed above, there is certainly sampling bias in these compilations since they were purposeful and not randomly obtained. Despite this, there is a surprising amount of uniqueness among the compilations.

The Lycos and Internets listings are more similar in focus in that they are commercial sites. The Infomine site was developed from an academic perspective. For this reason, we adjudge the Lycos-Infomine pairwise comparison to be most appropriate. Though sampling was directed for both sites, the intended coverage and perspective is different.

There is obviously much uncertainty in these various tables. Because of lack of randomness, these estimates are likely at the lower bounds for the number of deep Web sites. Across all estimating methods the mean estimate for number of deep Web sites is about 76,000, with a median of about 56,000. For the searchable database compilation only, the average is about 70,000.

The undercount due to lack of randomness, together with what we believe to be the best estimate above, namely the Lycos-Infomine pair, indicates to us that the ultimate number of deep Web sites today is on the order of 200,000.

Plotting the fully characterized random 100 deep Web sites against total record counts produces Figure 4. Plotting these same sites against database size (HTML-included basis) produces Figure 5.

Multiplying the mean size of 74.4 MB per deep Web site times a total of 200,000 deep Web sites results in a total deep Web size projection of 7.44 petabytes, or 7,440 terabytes. [43] [44a] Compared to the current surface Web content estimate of 18.7 TB (see Table 1), this suggests a deep Web size about 400 times larger than the surface Web. Even at the lowest end of the deep Web size estimates in Table 3 through Table 5, the deep Web size calculates as 120 times larger than the surface Web. At the highest end of the estimates, the deep Web is about 620 times the size of the surface Web.

Alternately, multiplying the mean document/record count per deep Web site of 5.43 million times 200,000 total deep Web sites results in a total record count across the deep Web of 543 billion documents. [44b] Compared to the Table 1 estimate of one billion documents, this implies a deep Web 550 times larger than the surface Web. At the low end of the deep Web size estimate this factor is 170 times; at the high end, 840 times.

Clearly, the scale of the deep Web is massive, though uncertain. Since 60 deep Web sites alone are nearly 40 times the size of the entire surface Web, we believe that the 200,000 deep Web site basis is the most reasonable one. Thus, across database and record sizes, we estimate the deep Web to be about 500 times the size of the surface Web.

Deep Web Coverage is Broad, Relevant

Table 6 represents the subject coverage across all 17,000 deep Web sites used in this study. These subject areas correspond to the top-level subject structure of the CompletePlanet site. The table shows a surprisingly uniform distribution of content across all areas, with no category lacking significant representation of content. Actual inspection of the CompletePlanet site by node shows some subjects are deeper and broader than others. However, it is clear that deep Web content also has relevance to every information need and market.

Figure 6 displays the distribution of deep Web sites by type of content.

More than half of all deep Web sites feature topical databases. Topical databases plus large internal site documents and archived publications make up nearly 80% of all deep Web sites. Purchase-transaction sites — including true shopping sites with auctions and classifieds — account for another 10% or so of sites. The other eight categories collectively account for the remaining 10% or so of sites.

Deep Web is Higher Quality

"Quality" is subjective: If you get the results you desire, that is high quality; if you don't, there is no quality at all.

When BrightPlanet assembles quality results for its Web-site clients, it applies additional filters and tests to computational linguistic scoring. For example, university course listings often contain many of the query terms that can produce high linguistic scores, but they have little intrinsic content value unless you are a student looking for a particular course. Various classes of these potential false positives exist and can be discovered and eliminated through learned business rules.

Our measurement of deep vs. surface Web quality did not apply these more sophisticated filters. We relied on computational linguistic scores alone. We also posed only five queries across various subject domains. Using only computational linguistic scoring does not introduce systematic bias in comparing deep and surface Web results, because the same criteria are used in both. The relative differences between the surface and deep Web should hold, even though the absolute values are preliminary and will overestimate "quality." The results of these limited tests are shown in Table 7.

This table shows that, on average for the limited sample set, there is about a three-fold greater likelihood of obtaining quality results from the deep Web than from the surface Web. The absolute numbers also show that deep Web sites tend to return 10% more documents than surface Web sites and nearly triple the number of quality documents.

While each query used three of the largest and best search engines and three of the best known deep Web sites, these results are somewhat misleading and likely underestimate the "quality" difference between the surface and deep Web. First, there are literally hundreds of applicable deep Web sites for each query subject area. Some of these additional sites would likely not return as high an overall quality yield, but they would add to the total number of quality results returned. Second, even with increased numbers of surface search engines, total surface coverage would not go up significantly and yields would decline, especially if duplicates across all search engines were removed (as they should be). And, third, we believe the degree of content overlap between deep Web sites to be much less than for surface Web sites. [45]

Though the quality tests applied in this study are not definitive, we believe they point to a defensible conclusion that quality is many times greater for the deep Web than for the surface Web. Moreover, the deep Web has the prospect of yielding quality results that cannot be obtained by any other means, with absolute numbers of quality results increasing as a function of the number of deep Web sites simultaneously searched. The deep Web thus appears to be a critical source when it is imperative to find a "needle in a haystack."

Deep Web Growing Faster than Surface Web

Lacking time-series analysis, we used the proxy of domain registration date to measure the growth rates for each of 100 randomly chosen deep and surface Web sites. These results are presented as a scattergram with superimposed growth trend lines in Figure 7.

Use of site domain registration as a proxy for growth has a number of limitations. First, sites are frequently registered well in advance of going "live." Second, the domain registration is at the root or domain level (e.g., www.mainsite.com ). The search function and page — whether for surface or deep sites — often is introduced after the site is initially unveiled and may itself reside on a subsidiary form not discoverable by the whois analysis.

The best way to test for actual growth is a time series analysis. BrightPlanet plans to institute such tracking mechanisms to obtain better growth estimates in the future.

However, this limited test does suggest faster growth for the deep Web. Both the median and average deep Web sites are four or five months "younger" than surface Web sites (registered around Aug. 95, vs. Mar. 95 for the surface sample). This is not surprising. The Internet has become the preferred medium for public dissemination of records and information, and more and more information disseminators (such as government agencies and major research projects) that have enough content to qualify as deep Web are moving their information online. Moreover, the technology for delivering deep Web sites has been around for a shorter period of time.

Thousands of Conventional Search Engines Remain Undiscovered

While we have specifically defined the deep Web to exclude search engines (see next section), many specialized search engines, such as those shown in Table 8 below or @griculture.com, AgriSurf, or joefarmer [formerly http://www.joefarmer.com/] in the agriculture domain, provide unique content not readily indexed by major engines such as AltaVista, Fast or Northern Light. The key reasons that specialty search engines may contain information not on the major ones are indexing frequency and the limitations the major search engines may impose on documents indexed per site. [11b]

To find out whether the specialty search engines really do offer unique information, we used similar retrieval and qualification methods on them — pairwise overlap analysis — in a new investigation. The results of this analysis are shown in the table below.

These results suggest there may be on the order of 20,000 to 25,000 total search engines currently on the Web. (Recall that all of our deep Web analysis excludes these additional search engine sites.)

M. Hofstede, of the Leiden University Library in the Netherlands, reports that one compilation alone contains nearly 45,000 search site listings. [46] Thus, our best current estimate is that deep Web searchable databases and search engines have a combined total of 250,000 sites. Whatever the actual number proves to be, comprehensive Web search strategies should include the specialty search engines as well as deep Web sites. Thus, BrightPlanet's CompletePlanet Web site also includes specialty search engines in its listings.

The most important findings from our analysis of the deep Web are that there is massive and meaningful content not discoverable with conventional search technology and that there is a nearly uniform lack of awareness that this critical content even exists.

Original Deep Content Now Exceeds All Printed Global Content

International Data Corporation predicts that the number of surface Web documents will grow from the current two billion or so to 13 billion within three years, a factor increase of 6.5 times; [47] deep Web growth should exceed this rate, perhaps increasing about nine-fold over the same period. Figure 8 compares this growth with trends in the cumulative global content of print information drawn from a recent UC Berkeley study. [48a]

The total volume of printed works (books, journals, newspapers, newsletters, office documents) has held steady at about 390 terabytes (TBs). [48b] By about 1998, deep Web original information content equaled all print content produced through history up until that time. By 2000, original deep Web content is estimated to have exceeded print by a factor of seven and is projected to exceed print content by a factor of sixty three by 2003.

Other indicators point to the deep Web as the fastest growing component of the Web, one that will continue to dominate its content. [49] Even today, at least 240 major libraries have their catalogs online; [50] UMI, a former subsidiary of Bell & Howell, has plans to put more than 5.5 billion document images online; [51] and major astronomy data initiatives are moving toward putting petabytes of data online. [52]

These trends are being fueled by the phenomenal growth and cost reductions in digital, magnetic storage. [48c] [53] International Data Corporation estimates that the amount of disk storage capacity sold annually grew from 10,000 terabytes in 1994 to 116,000 terabytes in 1998, and it is expected to increase to 1,400,000 terabytes in 2002. [54] Deep Web content accounted for about 1/338th of magnetic storage devoted to original content in 2000; it is projected to increase to 1/200th by 2003. As the Internet is expected to continue as the universal medium for publishing and disseminating information, these trends are sure to continue.

The Gray Zone

There is no bright line that separates content sources on the Web. There are circumstances where "deep" content can appear on the surface, and, especially with specialty search engines, when "surface" content can appear to be deep.

Surface Web content is persistent on static pages discoverable by search engines through crawling, while deep Web content is only presented dynamically in response to a direct request. However, once directly requested, deep Web content comes associated with a URL, most often containing the database record number, that can be re-used later to obtain the same document.

We can illustrate this point using one of the best searchable databases on the Web, 10Kwizard. 10Kwizard provides full-text searching of SEC corporate filings. [55] We issued a query on "NCAA basketball" with a restriction to review only annual filings filed between March 1999 and March 2000. One result was produced, for Sportsline USA, Inc. Clicking on that listing produces full-text portions of that annual filing containing the query string. With another click, the full filing text can also be viewed. The URL resulting from this direct request is:

http://www.10kwizard.com/blurbs.php?repo=tenk&ipage=1067295&exp=%22ncaa+basketball%22&g=

Note two things about this URL. First, our query terms appear in it. Second, the "ipage=" shows a unique record number, in this case 1067295. It is via this record number that the results are served dynamically from the 10KWizard database.
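
The same two observations can be checked programmatically; this short sketch uses Python's standard library to pull the query terms and record number back out of the URL above.

```python
from urllib.parse import urlparse, parse_qs

url = ("http://www.10kwizard.com/blurbs.php"
       "?repo=tenk&ipage=1067295&exp=%22ncaa+basketball%22&g=")

params = parse_qs(urlparse(url).query, keep_blank_values=True)
print(params["ipage"][0])  # 1067295 -- the database record number
print(params["exp"][0])    # "ncaa basketball" -- the query terms, percent-decoded by parse_qs
```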

Now, if we were doing comprehensive research on this company and posting these results on our own Web page, other users could click on this URL and get the same information. Importantly, if we had posted this URL on a static Web page, search engine crawlers could also discover it, use the same URL as shown above, and then index the contents.

It is by doing searches and making the resulting URLs available that deep content can be brought to the surface. Any deep content listed on a static Web page is discoverable by crawlers and therefore indexable by search engines. As the next section describes, it is impossible to completely "scrub" large deep Web sites for all content in this manner. But it does show why some deep Web content occasionally appears on surface Web search engines.

This gray zone also encompasses surface Web sites that are available through deep Web sites. For instance, the Open Directory Project is an effort to organize the best of surface Web content using voluntary editors or "guides." [56] The Open Directory looks something like Yahoo!; that is, it is a tree structure with directory URL results at each branch. The results pages are static, laid out like disk directories, and are therefore easily indexable by the major search engines.

The Open Directory claims a subject structure of 248,000 categories, [57] each of which is a static page. [58] The key point is that every one of these 248,000 pages is indexable by major search engines.

Four major search engines with broad surface coverage allow searches to be specified based on URL. The query "URL:dmoz.org" (the address of the Open Directory site) was posed to these engines to see how many of the site's pages each had indexed.

Although there are almost 250,000 subject pages at the Open Directory site, only a tiny percentage are recognized by the major search engines. Clearly the engines' search algorithms have rules about either the depth or breadth of surface pages indexed for a given site. We also found broad variation in the timeliness of results from these engines. Specialized surface sources or engines should therefore be considered when truly deep searching is desired. The bright line between the deep and surface Web is really shades of gray.

The Impossibility of Complete Indexing of Deep Web Content

Consider how a directed query works: specific requests need to be posed against the searchable database by stringing together individual query terms (and perhaps other filters such as date restrictions). If you do not ask the database specifically what you want, you will not get it.

Let us take, for example, our own listing of 38,000 deep Web sites. Within this compilation, we have some 430,000 unique terms and a total of 21,000,000 terms. If these numbers represented the contents of a searchable database, then we would have to issue 430,000 individual queries to ensure we had comprehensively "scrubbed" or obtained all records within the source database. Our database is small compared to some large deep Web databases. For example, one of the largest collections of text terms is the British National Corpus containing more than 100 million unique terms. [59]

It is infeasible to issue many hundreds of thousands or millions of direct queries to individual deep Web search databases. It is implausible to repeat this process across tens to hundreds of thousands of deep Web sites. And, of course, because content changes and is dynamic, it is impossible to repeat this task on a reasonable update schedule. For these reasons, the predominant share of the deep Web content will remain below the surface and can only be discovered within the context of a specific information request.
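
A back-of-the-envelope calculation makes the point concrete; the query rate assumed below is purely illustrative and not taken from the study.

```python
queries_per_site = 430_000   # unique terms in the 38,000-site compilation (from the text)
total_sites = 200_000        # estimated number of deep Web sites (from the text)
queries_per_second = 1       # assumed polite query rate; not from the study

total_seconds = queries_per_site * total_sites / queries_per_second
print(total_seconds / (3600 * 24 * 365))  # roughly 2,700 years of continuous querying
```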

Possible Double Counting

Web content is distributed and, once posted, "public" to any source that chooses to replicate it. How much of deep Web content is unique, and how much is duplicated? And, are there differences in duplicated content between the deep and surface Web?

"Surface Web sites are fraught with quality problems"

This study was not able to resolve these questions. Indeed, it is not known today how much duplication occurs within the surface Web.

Observations from working with the deep Web sources and data suggest there are important information categories where duplication does exist. Prominent among these are yellow/white pages, genealogical records, and public records with commercial potential such as SEC filings. There are, for example, numerous sites devoted to company financials.

On the other hand, there are entire categories of deep Web sites whose content appears uniquely valuable. These mostly fall within the categories of topical databases, publications, and internal site indices — accounting in total for about 80% of deep Web sites — and include such sources as scientific databases, library holdings, unique bibliographies such as PubMed, and unique government data repositories such as satellite imaging data and the like.

But duplication is also rampant on the surface Web. Many sites are "mirrored." Popular documents are frequently appropriated by others and posted on their own sites. Common information such as book and product listings, software, press releases, and so forth may turn up multiple times on search engine searches. And, of course, the search engines themselves duplicate much content.

Duplication potential thus seems to be a function of public availability, market importance, and discovery. The deep Web is not as easily discovered, and while mostly public, not as easily copied by other surface Web sites. These factors suggest that duplication may be lower within the deep Web. But, for the present, this observation is conjecture.

Deep vs. Surface Web Quality

The issue of quality has been raised throughout this study. A quality search result is not a long list of hits, but the right list. Searchers want answers. Providing those answers has always been a problem for the surface Web, and without appropriate technology will be a problem for the deep Web as well.

Effective searches should both identify the relevant information desired and present it in order of potential relevance — quality. Sometimes what is most important is comprehensive discovery — everything referring to a commercial product, for instance. Other times the most authoritative result is needed — the complete description of a chemical compound, as an example. The searches may be the same for the two sets of requirements, but the answers will have to be different. Meeting those requirements is daunting, and knowing that the deep Web exists only complicates the solution because it often contains useful information for either kind of search. If useful information is obtainable but excluded from a search, the requirements of either user cannot be met.

We have attempted to bring together some of the metrics included in this paper, [60] defining quality as both actual quality of the search results and the ability to cover the subject.

These strict numerical ratios ignore the fact that including deep Web sites may be the critical factor in actually discovering the information desired. Inclusion of deep Web sites may improve discovery by 600-fold or more.

Surface Web sites are fraught with quality problems. For example, a study in 1999 indicated that 44% of 1998 Web sites were no longer available in 1999 and that 45% of existing sites were half-finished, meaningless, or trivial. [61] Lawrence and Giles' NEC studies suggest that individual major search engine coverage dropped from a maximum of 32% in 1998 to 16% in 1999. [7b]

Peer-reviewed journals and services such as Science Citation Index have evolved to provide the authority necessary for users to judge the quality of information. The Internet lacks such authority.

An intriguing possibility with the deep Web is that individual sites can themselves establish that authority. For example, an archived publication listing from a peer-reviewed journal such as Nature or Science or user-accepted sources such as the Wall Street Journal or The Economist carry with them authority based on their editorial and content efforts. The owner of the site vets what content is made available. Professional content suppliers typically have the kinds of database-based sites that make up the deep Web; the static HTML pages that typically make up the surface Web are less likely to be from professional content suppliers.

By directing queries to deep Web sources, users can choose authoritative sites. Search engines, because of their indiscriminate harvesting, do not direct queries. By careful selection of searchable sites, users can make their own determinations about quality, even though a solid metric for that value is difficult or impossible to assign universally.

Serious information seekers can no longer avoid the importance or quality of deep Web information. But deep Web information is only a component of total information available. Searching must evolve to encompass the complete Web.

Directed query technology is the only means to integrate deep and surface Web information. The information retrieval answer has to involve both "mega" searching of appropriate deep Web sites and "meta" searching of surface Web search engines to overcome their coverage problem. Client-side tools are not universally acceptable because of the need to download the tool and issue effective queries to it. [62] Pre-assembled storehouses for selected content are also possible, but will not be satisfactory for all information requests and needs. Specific vertical market services are already evolving to partially address these challenges. [63] These will likely need to be supplemented with a persistent query system customizable by the user that would set the queries, search sites, filters, and schedules for repeated queries.

These observations suggest a splitting within the Internet information search market: search directories that offer hand-picked information chosen from the surface Web to meet popular search needs; search engines for more robust surface-level searches; and server-side content-aggregation vertical "infohubs" for deep Web information to provide answers where comprehensiveness and quality are imperative.

Michael K. Bergman is chairman and VP, products and technology of BrightPlanet Corporation, a Sioux Falls, SD automated Internet content-aggregation service. Although he trained for a Ph.D. in population genetics at Duke University, he has been involved in Internet and database-software ventures for the last decade. He was chairman of The WebTools Co., and is president and chairman of VisualMetrics Corporation in Iowa City, IA, which developed a genome informatics data system. He has frequently testified before the U.S. Congress on technology and commercialization issues, and has been a keynote or invited speaker at more than 80 national industry meetings. He is also the author of BrightPlanet's award-winning "Tutorial: A Guide to Effective Searching of the Internet." http://completeplanet.com/Tutorials/Search/index.asp . You may reach him by e-mail at [email protected] .

  • AlphaSearch — [formerly http://www.calvin.edu/library/searreso/internet/as/]
  • Direct Search — http://www.freepint.com/gary/direct.htm
  • Infomine Multiple Database Search — http://infomine.ucr.edu/
  • The BigHub (formerly Internet Sleuth) — [formerly http://www.thebighub.com/]
  • Lycos Searchable Databases — [formerly http://dir.lycos.com/Reference/Searchable_Databases/]
  • Internets (Search Engines and News) — [formerly http://www.internets.com/]
  • HotSheet — http://www.hotsheet.com
  • Plus minor listings from three small sites.

45. We have not empirically tested this assertion in this study. However, from a logical standpoint, surface search engines are all indexing ultimately the same content, namely the public indexable Web. Deep Web sites reflect information from different domains and producers.

Some of the information in this document is preliminary. BrightPlanet plans future revisions as better information and documentation is obtained. We welcome submission of improved information and statistics from others involved with the Deep Web. © Copyright BrightPlanet Corporation. This paper is the property of BrightPlanet Corporation. Users are free to copy and distribute it for personal use.

Links from this article:

10Kwizard http://www.10kwizard.com

About.com http://www.about.com/

Agriculture.com http://www.agriculture.com/

AgriSurf http://www.agrisurf.com/agrisurfscripts/agrisurf.asp?index=_25

AltaVista http://www.altavista.com/

Bluestone [formerly http://www.bluestone.com]

Excite http://www.excite.com

Google http://www.google.com/

joefarmer [formerly http://www.joefarmer.com/]

LookSmart http://www.looksmart.com/

Northern Light http://www.northernlight.com/

Open Directory Project http://dmoz.org

Oracle http://www.oracle.com/

Patent and Trademark Office http://www.uspto.gov

Securities and Exchange Commission http://www.sec.gov

U.S. Census Bureau http://www.census.gov

Whois http://www.whois.net

Yahoo! http://www.yahoo.com/

Product of Michigan Publishing, University of Michigan Library • [email protected] • ISSN 1080-2711

Dark Web Vs Deep Web Explained: What's The Difference?

The internet was created in 1983 and has become a key part of our everyday lives. In fact, Statista reported there were 5.35 billion internet users as of January 2024. But while millions of people spend their time surfing through Google and other search engines, there are deeper and darker parts of the web a lot of people don't know exist.

You've probably heard the terms "deep web" and "dark web" thrown around on the internet, on TV, or maybe even in real life. Though they're often used interchangeably to refer to the more questionable areas of the internet, they're quite different. And understanding these differences is a dive into the true depth of the internet. While the average person usually limits their browsing to the "surface web," these other parts of cyberspace are still teeming with life and have become home to many users, both benign and nefarious. With that said, here's all you need to know about the dark web, the deep web, and their differences.

What Is The Deep Web?

Right at the top of the internet is the surface web, which is made up of everything you can access with everyday browsers and search engines. But below the surface lies the deep web — the parts of the internet that aren't indexed. So they're right there on the internet, but you can't find them by doing a quick search. While it may come as a surprise to some, the deep web is estimated to hold about 7,500 terabytes of information, compared to only 19 terabytes on the surface web.

But how does this work? Let's break it down: Every search engine has bots, known as web crawlers or spiders, that go through the information on the internet and save it in the search engine's index. When you place a search on Google, the search can only look through indexed sites on the internet. However, some websites aren't indexed at all — this is what makes up the deep web. It's basically anything you can't find or access through a Google search. For example, this includes streaming services like Hulu  that require payment and registration, subscription-based news services or academic journals, and private databases of data from sites like PayPal or Cash App. The deep web is also home to the dark web, which is where you'll find a lot of the shady parts of the internet. 
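
To make the crawling-and-indexing idea concrete, here is a toy spider in Python; it is nothing like a production crawler, and the start URL and page limit are placeholders. Anything a crawler cannot reach this way, such as form results or login-protected pages, stays out of the index and so remains in the deep web.

```python
import re
import urllib.request
import urllib.robotparser
from urllib.parse import urljoin

def crawl(start_url, limit=10):
    """Fetch pages, record them in an index, and follow the links they contain."""
    robots = urllib.robotparser.RobotFileParser(urljoin(start_url, "/robots.txt"))
    robots.read()
    queue, index = [start_url], {}
    while queue and len(index) < limit:
        url = queue.pop(0)
        if url in index or not robots.can_fetch("*", url):
            continue  # pages the crawler never fetches never enter the index
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        index[url] = html
        for href in re.findall(r'href="([^"#]+)"', html):
            link = urljoin(url, href)
            if link.startswith("http"):
                queue.append(link)
    return index

# index = crawl("https://example.com/")
```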

What Is The Dark Web?

At the very bottom of the deep web is a small corner of the internet called the dark web. As threatening as it may seem, it only makes up a tiny portion of the deep web — less than 0.01%. Like the deep web, websites on the dark web aren't indexed, so they can't be opened using regular search engines. What makes the dark web different is that it can't be accessed with your everyday web browser — this anonymous and decentralized portion of the internet requires a special browser.

Dark net sites form a network that is protected by loads of encryption. Browsers like Tor (or The Onion Router) are able to peel back the layers of encryption, kind of like how you peel back the layers of an onion. Once it bypasses these layers, you get access to the content available on the dark web. The dark web is also extremely private as your online traffic goes through various servers all over the world before reaching a destination, and each server only knows the location of the one right before it, not its origin. So in theory, no one, not even the site you're accessing, can see your IP address, track your browsing history, or trace anything you do on the dark web back to you. 
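
As a toy illustration of that layering idea (not Tor's actual protocol), the sketch below wraps a message in three layers of symmetric encryption and removes them one at a time, the way each relay peels a single layer; it assumes the third-party cryptography package is installed, and the .onion address is a placeholder.

```python
from cryptography.fernet import Fernet

relay_keys = [Fernet.generate_key() for _ in range(3)]  # one key per hop

message = b"request for somehiddenservice.onion"
for key in relay_keys:            # the sender adds one layer per relay
    message = Fernet(key).encrypt(message)

for key in reversed(relay_keys):  # each relay strips only the layer meant for it
    message = Fernet(key).decrypt(message)

print(message)                    # b'request for somehiddenservice.onion'
```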

Purpose And Intent

What exactly are the deep and the dark web used for? For starters, they both exist to offer people a higher degree of privacy than what is available on the surface web. The deep web is mostly used to store confidential information that would be vulnerable if easily accessible on the surface web. So unless you have the correct login details for certain information on the deep web, it can't be accessed. This is great because you can have peace of mind knowing that sensitive data — like your medical information or internet banking profile — is safe.

On the other hand, the dark web takes privacy to a different level, allowing users to browse and communicate completely anonymously. When most people think of the dark web, they often think of illegal activity. Though the level of anonymity on the dark web has made it a haven for criminal dealings, that's not all it's used for. It's also used by regular people looking to browse privately and ensure their information or internet traffic isn't monitored or tracked. They're able to access books or academic journals that may not be available on the surface of the web and make payments using ostensibly anonymous currency like Bitcoin. Another category of dark web users includes journalists and whistleblowers who want to share confidential or sensitive information without fear of censorship or legal action.

How Are They Accessed?

Although you can access the surface web with your default search engine like Google or Bing, getting on the deep or dark web is a bit different. The deep web can be accessed using regular web browsers as long as you have the right credentials to access the information. However, what makes the deep web different from the surface web is that content is not indexed and cannot be found simply using search engines. But that's the fun part: You may not know it, but you already surf the deep web every day. When you log into your email, social media profile, or favorite video or music streaming platform, you're using the deep web. These pages remain exclusive to you as long as your username and password remain confidential.

The dark web, on the other hand, is a totally different ball game. Dark net websites can only be accessed using designated browsers like Tor or I2P, and they're different from those you come across on the surface web. For one, they usually don't have any recognizable URLs, just a string of numbers and letters. Also, all dark net sites end in .onion, not .com or .org. The most popular browser is Tor. Tor anonymizes your online activity by passing your connection through at least three points, making it a lot harder to trace anything back to you. Other dark net browsers like I2P and Freenet work in similar ways.  

Different Content Types

Search engine bots don't crawl and index every piece of information on the internet. There are some sites they are restricted from accessing, and others that just aren't relevant. But just because this information isn't indexed doesn't mean it isn't there. Most information on many websites is contained on the deep web — it is typically made up of private information from the individual pages or profiles of websites on the surface web. These pages are usually password protected and are only open to users with the right login credentials unique to the user they are registered to. 

The content on the deep web includes medical and financial records, content hidden behind paywalls like subscription-based services and academic journals, and even pages generated on demand, like your Google search results. On the dark web, you'll find a lot of what you ordinarily would on the surface web — blogs, mail services, marketplaces, games, and forums — but these pages are entirely concealed. Many people use the dark web in the same way they use the surface web but probably opt for the former because they value privacy. But most of what goes on the dark web is illegal. Several sites and marketplaces are dedicated to the sale of illegal drugs, weapons, passports, credit card details, and even identities. You can even hire services like hacking. It's some really dangerous stuff.

While the deep web is unindexed, it's not necessarily anonymous. A lot of information on the deep web requires you to log in to an account to keep your data secure, but this also makes it traceable to your identity. Websites on the deep web still use cookies and trackers to collect data from users. Your IP address is also still visible when you visit a site on the deep web, and your exact location can be pinpointed. So if your goal is to surf the web completely off the radar, the deep web won't be of much help.

The dark web, however, is almost entirely anonymous. Whenever you visit a website or do pretty much anything on the internet, your IP address can be used to pinpoint your exact location. This is how websites and even the government track your internet activity. But if you use the dark web, your activity and communication are protected by layers of encryption so that when your connection reaches a website or server, it's difficult to find its original source. This anonymity is what makes the dark web a safe haven for hackers and other criminal activities. To maintain this ghost status, transactions on the dark web are usually done via cryptocurrencies like Bitcoin, which are also difficult to trace.

Security Risks

Most of the content on the deep web is mundane, perfectly legal, and not that different from what's on the surface web. But this doesn't mean you're completely safe. You still run the risk of falling victim to email fraud and scams on online shopping sites. There are also many suspicious links and websites that can infect your computer with viruses and malware. You might think you're downloading a song or a book, but the file is actually a trojan horse carrying dangerous software. All this can be easily avoided if you take safety measures when browsing the deep web to ensure your private accounts aren't compromised. Otherwise, your information might end up for sale on the dark web.

The same can't be said for the dark web. The dark web is largely unregulated, which makes it easy for a lot of illegal online activity to go unchecked. You're exposed to the risk of scams and identity theft. Many sites on the dark web are secret ploys for different types of cyberattacks like malware, ransomware, or phishing exploits designed to obtain your information for fraudulent use or for sale. Due to how anonymous the dark web is, it's harder to find the source of these exploits. And because there are no concrete laws regulating the dark web, it's much harder to hold anyone accountable.

Presentation Transcript

Deep Web: Dark net, Invisible Web, Hidden Web. A presentation by OmaR AL-SaffaR.

If you think you are accessing the whole internet through browsers such as Chrome and Firefox, you are completely wrong.

Definition: The Deep Web (also called the Deepnet, Invisible Web, or Hidden Web) is World Wide Web content that is not part of the Surface Web, which is indexed by standard search engines.

These surface browsers only give you access to four per cent of the actual internet. Then there is the hidden mass of content, known as the Deep Web, often pictured as the submerged part of an iceberg.

It's called the "Deep Web" because it is, well, deep, and contains all of the websites and data that you can't access through a regular search engine. The Deep Web itself makes up nearly 96 per cent of the world's internet content and is totally anonymous. In fact, you can't even access the deep web unless you are also anonymous. This raises the question: how do you enter the deep web?

Levels of the Web

  • Level 0 - Common Web: everything
  • Level 1 - Surface Web (e.g., web hosting)
  • Level 2 - Bergie Web (e.g., video streams)
  • Level 3 - Deep Web (e.g., hackers, celebrity scandals)
  • Level 4 - Charter Web (e.g., Onion IB, Hidden Wiki, most of the black market)
  • Level 5 - Marianas Web

How do you access it? You need to use a special type of anonymous browser, the most commonly used being TOR (The Onion Router). These browsers can reach information not available through surface browsers, such as pages not linked from other pages, sites that require registration, and limited-access content.

Is it legal? After the NSA spying scandal, concerns have been growing over internet privacy. This has led people to flock to TOR for its ability to let users browse the web in true anonymity.

The Deep Web is perfectly legal and even used by law enforcement agencies...

So what about the legality of it all, you ask? It's completely legal for you to use it and to hide your identity when browsing the internet. However, because of this anonymity, the Deep Web has become a haven for all types of dodgy people like drug dealers, weapons dealers, and even hit men looking for work.

Criminal activity takes place in the deep Web.

Related Concepts

TOR: The Onion Router (TOR) is an anonymous browsing client, which allows its users to browse the Internet anonymously by separating identification and routing, thus concealing network activity from surveillance. Some websites on the deep Web can only be accessed via the TOR client.

Silk Road: The Silk Road is an online black market which can only be accessed via the TOR browsing client. Many sellers on the site specialize in trading illegal drugs for Bitcoins, a peer-to-peer digital currency.

Hidden Wiki: The Hidden Wiki is a wiki database that can only be accessed via the TOR browsing client and contains articles and links to other deep Web sites, the Silk Road, assassin markets and child pornography sites.

Bitcoins: A type of currency often used in deep Web black markets is the Bitcoin, a peer-to-peer digital currency that regulates itself according to network software, with no more than 21 million Bitcoins to be issued in total by 2140. Bitcoins can be purchased, and current exchange rates viewed, on the MT Gox Bitcoin exchange (about 489.45 USD per Bitcoin at the time of this presentation).

Search interest: worldwide web-search interest in the topic over time, 2004 to present.

Example URL for a deep Web site: http://<name-of-web-site>.onion/

Thank you for listening.


Deep Web Search Engines

Deep Web Search Engines

Deep web content is believed to be about 500 times bigger than normal search content, and it mostly goes unnoticed by regular search engines. When you look at the typical search engine, it performs a generic search.

169 views • 11 slides

What is deep web?

What is deep web?

Deep web is an underground world of the internet. It is also called as dark internet. The dark internet or the deep web links are not indexed by the popular search engines like Google, yahoo, bing etc., To get the basic idea see the infographic. http://deep-weblinks.com/deep-web/

135 views • 10 slides

Deep-Web Crawling and Related Work

Deep-Web Crawling and Related Work

Deep-Web Crawling and Related Work. Matt Honeycutt CSC 6400. Outline. Basic background information Google’s Deep-Web Crawl Web Data Extraction Based on Partial Tree Alignment Bootstrapping Information Extraction from Semi-structured Web Pages

559 views • 53 slides

Deep  Web Crawling

Deep Web Crawling

Mathy Vanhoef. Deep Web Crawling. Co-presentation. Values for generic text boxes. 2. 1. Initial seed keywords are extracted from the form page. A query template with only the generic text box is submitted. 4. 3. Discard keywords not representative for the page ( TF-IDF rank ).

148 views • 8 slides

Interesting Facts Behind deep web

Interesting Facts Behind deep web

latest onion links https://onion.love

92 views • 4 slides

Welcome To Deep Web Guns

Welcome To Deep Web Guns

DEEPWEBGUNS WISHES YOU A WARMTH WELCOME TO OUR WEBSITE. THE SAFEST PLACE TO BUY FIREARMS ONLINE WITHOUT FFL LICENSE. WE PROVIDE RAPID, RELIABE AND DISCRETE SERVICES TO OUR CLIENTS.

85 views • 6 slides

IMAGES

  1. What is the Dark Web? What is the Deep Web? How to Access the Dark Web

    presentation deep web

  2. PPT

    presentation deep web

  3. Deep Web: un vistazo a la Internet profunda, este mapa te ayudara a

    presentation deep web

  4. What is the Deep Web and How to access it

    presentation deep web

  5. Presentation DEEP WEB by Hana Mouakher

    presentation deep web

  6. PPT

    presentation deep web

VIDEO

  1. DEEP WEB TRAILER(colaboración de fans)

  2. UTOK Consilience Conference 2023

  3. Technical Information (Part

  4. Deep Web vs Dark Web: Unveiling the Mystery

  5. AI & Deep Learning Applications in Bioinformatics Symposium, Tuesday Workshop

  6. How to easily access the Deep Web 😳

COMMENTS

  1. Deep web power point presentation

    Deep web power point presentation. Deep web power point presentation - Download as a PDF or view online for free.

  2. The Deep and Dark Web

    Download now. The Deep and Dark Web. 1. The Deep & Dark Web Jyotsna Gorle ThoughtWorks Inc. 2. What is the Deep Web The Surface Web The Deep Web. 3. What is the DARK Web • The covert and illegal activities make the major part of it. It's not the same.

  3. Deep Web

    Deep web power point presentation. Deep web power point presentation albafg55 ...

  4. What is the dark web? How safe is it and how to access it? Your

    The deep web also includes most academic content handled directly by universities. Think of this like searching for a library book using the facilities' own index files - you might have to be ...

  5. PDF The Dark Web

    The deep web i s t he par t of t he i nt er net whi ch i s gener al l y hi dden f r om publ i c vi ew. Unl i ke t he open web, t he deep web i s not accessed vi a t he usual sear ch engi nes. Much of i t i s ver y or di nar y; or gani sat i ons have websi t es t hat can onl y be r ead by ... PowerPoint Presentation Author:

  6. PDF July 22, 2022 The Dark Web: An Overview

    Within the deep web is the dark web, the segment of the deep web that has been intentionally hidden. It refers to internet sites that users generally cannot access without using special software. While the content of these sites may be accessed using this software, publishers of these sites are often concealed. Users access the dark web

  7. Darkweb research: Past, present, and future trends and mapping to

    1. Introduction. The Darkweb or Darknet is an intrinsic part of the deep web but represents the darker and regressive side of the world wide web. Key characteristics of the Darkweb include the inability to search or list them through legal platforms, passwords to gain entry when accessible, and hidden identities of users, network traffic, IP addresses, and data exchanged through them [1].

  8. Deep Web Technology

    The deep Web contains nearly 550 billion individual documents compared to the 1 billion of the surface Web. More than deep Web sites presently exist. DEEP WEB V/S SURFACE WEB (Contd.) 60 of the largest deep-Web sites collectively contain about 750 terabytes of information - sufficient by themselves to exceed the size of the surface Web 40 times.

  9. Deep Web Vs Dark Web PowerPoint and Google Slides Template

    Download this presentation template for MS PowerPoint and Google Slides to highlight the significant differences between Deep Web vs. Dark Web. The high-definition infographics ensure clear visibility on large screens and allow easy customization.

  10. PDF Chapter 1: Understanding the Dark Web

    It is estimated that the Deep Web contains about 102,000 unstruc-tured databases and 348,000 structured databases. In other words, there is a ratio of 3.4 structured data sources for every one (1) unstructured source. Figure 1.1 is the result of a sample of Deep Web databases con-ducted by Bin et al. (2007) (Fig. 1.2). Finally, the Dark Web is ...

  11. 9 Best Deep Web-Themed Templates for PowerPoint & Google Slides

    9 Best Deep Web-Themed Templates. CrystalGraphics creates templates designed to make even average presentations look incredible. Below you'll see thumbnail sized previews of the title slides of a few of our 9 best deep web templates for PowerPoint and Google Slides. The text you'll see in in those slides is just example text.

  12. White Paper: The Deep Web: Surfacing Hidden Value

    The deep Web is the largest growing category of new information on the Internet. Deep Web sites tend to be narrower, with deeper content, than conventional surface sites. Total quality content of the deep Web is 1,000 to 2,000 times greater than that of the surface Web. Deep Web content is highly relevant to every information need, market, and ...

  13. What is the Deep Web and What Will You Find There?

    The deep web is an umbrella term for parts of the internet not fully accessible using standard search engines such as Google, Bing and Yahoo. The contents of the deep web range from pages that were not indexed by search engines, paywalled sites, private databases and the dark web. Every search engine uses bots to crawl the web and add the new ...

  14. Deep web

    2. Deep Web • The deep Web (also called Deepnet, the invisible Web, dark Web or the hidden Web) refers to World Wide Web content that is not part of the surface Web, which is indexed by standard search engines. • The deep Web is about 500 times bigger than the surface. • It was estimated that the deep Web contained approximately 7,500 terabytes of data and 550 billion individual documents.

  15. The Deep Web: PPT Presentation (Download)

    Click Here to Download deep web ppt. Preview of Deep Web ppt. Transcript. 1. The Deep Web PPT. 2. Surface Web: The surface Web is that portion of the World Wide Web that is indexable by conventional search engines. It is also known as the Clearnet, the visible Web, or the indexable Web. Eighty-five percent of Web users use search engines to ...

  16. Dark Web PowerPoint and Google Slides Template

    Deep Web Vs Dark Web. $5.00. Add to Wish List Add to Compare. Product Details. Harness our Dark Web presentation template for MS PowerPoint and Google Slides to describe the concealed or secretive portion of the internet that is intentionally hidden and inaccessible through standard search engines.

  17. Deep Web

    Deep Web - Free download as Powerpoint Presentation (.ppt), PDF File (.pdf), Text File (.txt) or view presentation slides online. A brief description of the "Deep web" and its intricate and mysterious workings.

  18. Free Deep Web PowerPoint Template

    Add Comment. The free Deep Web PowerPoint Template has a black background with a symbolic image of the hacker and algorithms. This simple background makes the template look neat and professional. The template is suitable for presentations about dark web hacking, online criminal activities, frauds on the internet, phishing, terrorism, etc.

  19. Deep web vs. dark web: What's the difference?

    Like everything connected to the online world, both the deep web and the dark web have their pluses and minuses. Pros of the deep web: Enhanced privacy. Secure website connections. Password-and/or link-protected information. Pros of the dark web: No censorship.

  20. The surface web, deep web and dark web

    A. adeptdigital. The web is too big to map or traverse, too decentralised to manage, index or licence, and too dynamic to master. As a result, we rely on services such as Google Search to guide us to the resources we seek. Usually, "asking Google" is done uncritically, with the underlying, untested assumptions that either "Google knows ...

  21. 88 Best Dark Web-Themed Templates

    Below you'll see thumbnail sized previews of the title slides of a few of our 88 best dark web templates for PowerPoint and Google Slides. The text you'll see in in those slides is just example text. The dark web-related image or video you'll see in the background of each title slide is designed to help you set the stage for your dark web ...

  22. Dark Web Vs Deep Web Explained: What's The Difference?

    At the very bottom of the deep web is a small corner of the internet called the dark web. As threatening as it may seem, it only makes up a tiny portion of the deep web — less than 0.01%.

  23. Deep web Seminar

    DEEP WEB • The Deep Web is World Wide Web content that is not part of the Surface Web, which is indexed by standard search engines. • It is also called the Deep Net, Invisible Web or Hidden Web. • Largest growing category of new information on the Internet. • 400-550 more public information than the Surface Web.

  24. PPT

    It's called the "Deep Web" because it is, well, deep and contains all of the websites and data that you cant access through a regular search engine. The Deep Web itself makes up nearly 96 per cent of the worlds internet content and is totally anonymous. In fact, you cant even access the deep web unless you are also anonymous.