
Web Crawls

The Internet Archive's Web Archive began in late 1996 and is made available through the Wayback Machine; some collections are also available in bulk to researchers. Many pages are archived by the Internet Archive on behalf of other contributors, including Archive-It partners and Save Page Now users. Other captures are donated to the Internet Archive by partners such as Alexa Internet.

2,606,220 RESULTS

collections 8,709

web 2,566,599

movies 23,842

data 5,861

audio 479

texts 358

software 357

images 14

other 1

TOPIC

crawldata 978,784

wiki 656,154

dumps 629,720

incremental 598,348

Wikipedia 239,731

Wiktionary 125,421

Wikibooks 66,008

archiveteam 62,900

Wikiquote 57,099

Wikisource 53,676

Wikimedia 34,782

data dumps 31,351

no404 26,701

wikiteam 26,534

MediaWiki 26,466

Wikinews 24,873

English 23,996

videobot 22,870

live 18,922

livestream 18,918

stream 18,918

unknowncopyright 16,176

Wikivoyage 14,227

wikipedia 13,693

Wikiversity 12,796

wordpress 12,584

tv 8,017

Italian 6,836

French 6,834

Greek 6,834

German 6,832

Spanish 6,832

Swedish 6,798

Portuguese 6,795

Russian 6,795

Portuguese Web Archive 6,583

Portuguese online publications 6,583

Arabic 5,990

Czech 5,990

Japanese 5,989

Finnish 5,987

Korean 5,984

Hebrew 5,981

Ukrainian 5,956

Polish 5,948

Romanian 5,944

Chinese 5,922

Persian 5,867

television 5,408

TV 5,258

Dutch 5,142

Bosnian 5,136

Catalan 5,136

Bulgarian 5,135

Esperanto 5,133

Norwegian 5,110

Serbian 5,106

Tamil 5,105

Turkish 5,105

Vietnamese 5,104

Slovenian 5,102

gizmodo.com 4,511

gawker.com 4,350

deadspin.com 4,346

jalopnik.com 4,334

Hungarian 4,311

Thai 4,293

Welsh 4,282

Azerbaijani 4,281

Croatian 4,281

Lithuanian 4,281

Belarusian 4,280

Estonian 4,280

Limburgish 4,280

Armenian 4,279

Galician 4,279

Marathi 4,279

Malayalam 4,278

Danish 4,277

Indonesian 4,277

Latin 4,277

Icelandic 4,273

Telugu 4,258

Albanian 4,255

Slovak 4,253

Sanskrit 4,252

www.dailymail.co.uk 4,225

WARC 3,943

archive 3,907

snapshot 3,894

Arcmaj3 3,862

Arcmaj3BarrelData 3,740

media 3,496

tape 3,494

Complete crawl of the Portuguese web 3,476

Gujarati 3,458

Kannada 3,458

Breton 3,426

Georgian 3,426

Basque 3,425

Bengali 3,425

Hindi 3,425

Afrikaans 3,424

Kyrgyz 3,424

Macedonian 3,423

Kurdish 3,422

io9.gizmodo.com 3,415

Urdu 3,404

kotaku.com 3,267

Incremental crawl of the Portuguese web 3,107

website 3,030

lifehacker.com 2,995

research 2,901

metro.co.uk 2,897

european 2,875

forum 2,875

parliament 2,875

plenary 2,875

session 2,875

web archive 2,870

discussion forum 2,864

university 2,856

george 2,855

gmu-tv 2,855

gmutv 2,855

mason 2,855

North Korea 2,733

KCTV 2,732

24 2,679

austria 2,679

austria24 2,679

education 2,679

science 2,678

health 2,677

medicine 2,676

humanities 2,674

UCSD 2,673

UCSD-TV 2,673

UCTV 2,673

arts 2,673

public affairs 2,673

public television 2,673

san diego 2,673

satellite 2,673

university of california 2,673

university of california television 2,673

drenthe 2,663

dutch 2,663

nederlands 2,663

rtv 2,663

london 2,654

Chinese (Min Nan) 2,635

Kazakh 2,603

Uzbek 2,597

Sundanese 2,595

Tatar 2,591

bridge 2,590

tower 2,590

FIX 2,583

hungarian 2,583

hungary 2,583

Faroese 2,571

Malagasy 2,571

Western Frisian 2,571

Interlingua 2,570

Khmer 2,569

Malay 2,569

Nepali 2,565

Norwegian Nynorsk 2,561

Occitan 2,560

Wolof 2,557

Yiddish 2,557

Tajik 2,556

Tagalog 2,555

Punjabi 2,554

Sinhala 2,552

Venetian 2,550

Oriya 2,424

2015 2,135

archivebot 2,082

www.telegraph.co.uk 2,048

Old English 2,033

Interlingue 1,965

Asturian 1,785

Corsican 1,785

Irish 1,784

Luxembourgish 1,783

Nauru 1,783

Kashmiri 1,781

Low German 1,780

Assamese 1,778

Quechua 1,774

Simple English 1,773

Uyghur 1,773

Turkmen 1,772

Amharic 1,756

Aymara 1,750

Guarani 1,750

Latvian 1,750

Lingala 1,750

LANGUAGE

English 102,251

Portuguese 6,792

German 4,329

Dutch 2,965

Korean 2,796

Hungarian 2,730

Spanish 960

Russian 889

French 719

Chinese 248


Collection	Items
Internet Archive Web Crawls	790,449
Alexa Crawls	137,604
Worldwide Web Crawls	421,501
Survey Crawls	63,876
Live Web Proxy Crawls	13,609
Archive-It Digital Collection	207,093
Survey Crawl April 2013	16,282
Focused Crawls	125,541
Custom Crawl Services	46,206
web-group-internal	29,828
Wide Crawl started April 2013	25,005
Wayback Indexes	554
Top Domains	68,309
Archive-It Partners	127,831
Fix Broken Links Web Crawls	45,006
alexa_2007	7,636
Survey Crawl December 2014	11,190
Wide Crawl started June 2014	45,313
Wide Crawl started August 2013	21,909
alexa_2006	6,507
Wide Crawl started January 2012	30,362
Wiki Collections	727,346
Wikipedia Outlinks	12,403
Archive Team	127,680
Wide Crawl started April 2012	39,252
Wide Crawl Number 12 - started March, 14th 2015	49,621
Wikipedia Outbound Links	12,730
Survey Crawl started July 2015	10,137
Survey Crawl May 2014	6,909
Wide Crawl started October 2010	15,839
Wide Crawl Started January 2013	15,138
Wide Crawl started September 2012	22,402
Around The World Crawl	2,150
Wide Crawl started October 2011	10,122
Survey Crawl	12,622
Top News	48,925
ArchiveBot: The Archive Team Crowdsourced Crawler	1,706
.com survey started January 2011	2,535
Wide Crawl started March 2011	8,528
Wide Crawl started February 2014	9,789
Wide Crawl Number 13	46,049
38_crawl	1,387
alexa_web_2009	3,080
alexa_web_2010	2,994
Wordpress Blogs and the Pages They Link To	12,583
Wikipedia Outlinks February 2012	2,951
National Library of Australia Crawls	11,022
Alexa Crawl EG	1,678
Wide Crawl Number 14 started March 2016	34,127
web_iq	2,650
web_wk	9,978
National Library of Spain	6,722
26_crawl	1,466
51_crawl	1,138
52_crawl	2,589
Bibliotheque Nationale de France Domain Crawls	1,652
35_crawl	1,179
Shallow Crawls	1,042
Alexa Crawls DF	248
Alexa Crawl EI	1,408
Wikipedia Outlinks March 2016	10,320
alexa_1999	243
International News Crawls	3,581
Alexa Crawl DX	1,442
29_crawl	1,568
web_el_2008	1,705
Alexa Crawls DO	493
web_mon	3,810
Wikipedia Outlinks May 2011	1,638
Alexa Crawls EA	1,315
Alexa Crawls DY	1,326
Internet Archive Global Events	7,118
20th Century Web	331
Elections Web	1,609
Election Crawl 2012	1,608


archive.today webpage capture of host archive.org, saved 11 Sep 2016 17:24:01 UTC

Subject	Poster	Replies	Date
Site Removal Please	MGMidget1234	0	Jun 9, 2016 8:57am
Site Removal Request	4687431212	1	May 1, 2016 8:23pm
Re: Site Removal Request	4687431212	0	May 1, 2016 10:41pm
Takedown request	victorlsxiv	0	Apr 24, 2016 7:00am
only two hours left of April20 (420): everybody Wayback cannabis homepages	EarthFurst	2	Apr 20, 2016 4:04pm
Re: only two hours left of April20 (420): everybody Wayback cannabis homepages	EarthFurst	0	Apr 20, 2016 3:31pm
Re: only two hours left of April20 (420): everybody Wayback cannabis homepages	EarthFurst	0	Apr 20, 2016 5:04pm
"archived" pages disappearing from Wayback: reference at archive.is	EarthFurst	1	Apr 20, 2016 12:21pm
Re: 'archived' pages disappearing from Wayback: reference at archive.is	Jeff Kaplan	1	Apr 20, 2016 4:08pm
Re: 'archived' pages disappearing from Wayback: reference at archive.is	EarthFurst	1	Apr 22, 2016 2:37am
Re: 'archived' pages disappearing from Wayback: reference at archive.is	Jeff Kaplan	0	Apr 22, 2016 10:03am
The Wayback Machine Forum is "(closed)", but nothing will stop me from adding this post– BELIEVE IT!	pegzmasta	1	Apr 6, 2016 6:27pm
Re: Original Archive is '(closed)'	PDpolice	1	Apr 6, 2016 6:14pm
Re: Original Archive is '(closed)'	pegzmasta	0	Apr 7, 2016 2:37pm
Multiple Set-Cookie Headers: Wayback	River_Delta_CA_USA	0	Apr 4, 2016 10:17am
Hi, Wayback– Problem Solved!	pegzmasta	1	Apr 3, 2016 11:13am
This Is Only a Test	Dupenhagen Moonbat	1	Apr 6, 2016 4:46pm
Re: This Is Only a Test	pegzmasta	0	Apr 6, 2016 5:27pm
how to query for all the websites that end in ".com.br"?	LucasMation	1	Mar 31, 2016 6:20am
Re: how to query for all the websites that end in '.com.br'?	pegzmasta	1	Apr 1, 2016 10:13am
Re: how to query for all the websites that end in '.com.br'?	LucasMation	1	Apr 1, 2016 12:03pm
Re: how to query for all the websites that end in '.com.br'?	pegzmasta	0	Apr 1, 2016 12:19pm
Challenge: Read, Reply, and Correct! [The Internet Archive is tasked with preserving content on the Internet, but will it preserve and fix it's own forums?]	pegzmasta	0	Mar 16, 2016 2:35pm
How long does it take to get a response from info@archive.org?	juwhyonee	1	Feb 26, 2016 10:26am
Re: How long does it take to get a response from info@archive.org?	aanon	0	May 3, 2016 5:45am
problem with waybacks of comicbookresources.com homepage after 2013	EarthFurst	0	Feb 18, 2016 1:47am
my website is not archiving	jon617	0	Jan 7, 2016 4:11pm
So does excluding via robots actually delete or not?	talkingnewspapers	0	Jan 7, 2016 9:46am
Crawl and archive a whole website recursively	maltris	0	Jan 7, 2016 2:26am
My Website Is Not Crawled Despite Removing Restrictions From Robots.txt	leodwight	0	Jan 4, 2016 7:56pm
What is the algorithm for deciding when to not crawl a page anymore?	zwol	0	Dec 4, 2015 9:37am
End of an era: Imageshack deletes free accounts	Javik	0	Nov 28, 2015 12:55pm
Wayback machine rebuild suggestions	Archive Lover1	1	Oct 23, 2015 8:44am
Re: Wayback machine rebuild suggestions	h891322	0	Dec 12, 2015 5:55am
Entire website archival	tycio	0	Oct 22, 2015 10:57pm
Late 2007 Archive... Gone?	PeabodySam	0	Oct 9, 2015 5:29pm
How do I retrieve the original form of a page from the Wayback Machine?	zwol	1	Sep 1, 2015 2:17pm
Re: How do I retrieve the original form of a page from the Wayback Machine?	DKL3	2	Sep 1, 2015 2:45pm
Re: How do I retrieve the original form of a page from the Wayback Machine?	zwol	0	Sep 3, 2015 11:47am
Re: How do I retrieve the original form of a page from the Wayback Machine?	slowride13	0	Sep 29, 2015 9:25am
Cannot see content on website but could see before ?	Izzy15	1	Aug 31, 2015 6:51am
Re: Cannot see content on website but could see before ?	slowride13	1	Sep 29, 2015 9:36am
Re: Cannot see content on website but could see before ?	Izzy15	0	Sep 29, 2015 2:09pm
Cannot see content on website but could see before ?	Izzy15	0	Aug 31, 2015 6:51am
searching url substring	iaw4	0	Aug 27, 2015 8:54am
